A negative transcription regulatory element (a constitutive suppressor) between residues -250 and -190 of the 5' flanking
region of the mouse serglycin gene, a positive (hematopoietic cell enhancer) regulatory element located between residues -118 and
ments and hosts transformed therewith, are described. These regulatory elements, vectors and hosts are useful in gene transcription
of heterologous genes in eukaryotic cells, and especially in hematopoietic cells. In addition, transcriptional factors that bind

TITLE OF THE INVENTION

HEMATOPOIETIC CELL SPECIFIC TRANSCRIPTIONAL 
REGULATORY ELEMENTS OF SERGLYCIN 
AND USES THEREOF 



Cross Reference to Related Applications

This application is a continuation-in-part of U.S. Application No.
07/816^9, filed January 3, 1992, which is a continuation-in-part of U.S.
Application No. 07/635,544, filed January 18, 1991, which is the U.S.
National Phase of PCT/US89/03051, filed July 13, 1989, which is a
continuation-in-part of U.S. Application No. 07/224,035, filed July 13,
1988, now abandoned; the contents of each of these priority applications
are fully incorporated herein by reference.

Statement of Government Interest

The research underlying this invention was supported with U.S.
government funds; the U.S. government has certain rights in this 
invention. 



Field of the Invention 

The invention is in the area of recombinant DNA technology. 
Specifically, the invention is directed to a hematopoietic cell-specific 
transcriptional enhancer element, a transcriptional suppressor element, 
and a promoter element, all present in the 5' flanking region of the 
serglycin gene; the invention is further directed to recombinant vectors
containing such elements, hosts transformed with such vectors, and the
use of vectors and hosts for recombinant gene transcription. The invention 
is also directed to purified protein factors that specifically bind to the 
transcriptional elements of the invention. 



BACKGROUND OF THE INVENTION 



The rat connective tissue mast cell was the first cell conclusively
shown to store proteoglycans in an intracellular secretory granule
compartment (Benditt et al., J. Histochem. Cytochem. 4:419 (1956)). Rat
connective tissue mast cells contain up to 25 pg/cell of an acidically
charged ~750 kDa (kilodalton) proteoglycan that possesses a very small
peptide core to which approximately seven heparin glycosaminoglycans of
75-100 kDa are attached (Yurt et al., J. Biol. Chem. 252:518 (1977);
Robinson et al., J. Biol. Chem. 253:6687 (1978); Metcalfe et al., J. Biol.
Chem. 255:11753 (1980)). Because the peptide core of mature rat heparin
proteoglycan consists almost entirely of equal amounts of serine and
glycine (Robinson et al., J. Biol. Chem. 253:6687 (1978); Metcalfe et al.,
J. Biol. Chem. 255:11753 (1980)) and because heparin glycosaminoglycan
is O-glycosidically linked to serine at serine-glycine sequences within its
peptide core (Lindahl et al., J. Biol. Chem. 240:2817 (1965)), it was first
postulated by Robinson and coworkers (Robinson et al., J. Biol. Chem.
253:6687 (1978)) that the mature peptide core of this proteoglycan is
predominantly an alternating sequence of serine and glycine.

It is now known that many cells of hematopoietic origin (including
serosal mast cells, mucosal mast cells, basophils, natural killer cells,
cytotoxic T lymphocytes, eosinophils, macrophages, and platelets) store a
family of proteoglycans in a cytoplasmic granule compartment that is
distinct from the plasma membrane-localized and extracellular
matrix-localized families of proteoglycans (Stevens et al., Curr. Topics
Microbiol. Immunol. 140:93-108 (1988)). These intracellular proteoglycans
(known as "serine-glycine rich proteoglycans," "SG-PG," "secretory granule
proteoglycan," or "serglycin proteoglycans") have five to seven highly
sulfated glycosaminoglycans attached O-glycosidically to a common 18,600
to 16,700 peptide core possessing a protease-resistant
glycosaminoglycan attachment region that is a repeat of serine and glycine
amino acids (Yurt et al., J. Biol. Chem. 252:518-521 (1977); Robinson
et al., J. Biol. Chem. 253:6687-6693 (1978); Razin et al., J. Biol. Chem.
257:7229-7236 (1982); Stevens et al., J. Biol. Chem. 260:14194-14200
(1985); Seldin et al., J. Biol. Chem. 260:11131-11139 (1985); Bourdon
et al., Proc. Natl. Acad. Sci. USA 82:1321-1325 (1985); Bourdon et al., J.
Biol. Chem. 261:12534-12537 (1986); Avraham et al., J. Biol. Chem.
263:7292-7296 (1988); Avraham et al., Proc. Natl. Acad. Sci. USA
86:3763-3767 (1989); Stevens et al., J. Biol. Chem. 263:7287-7291 (1988);
Alliel et al., FEBS Lett. 236:123-126 (1988); Stellrecht et al., Nucleic
Acids Res. 17:7523 (1989)). The peptide core of this family of
proteoglycans has also been referred to by a variety of names, such as
"secretory granule proteoglycan peptide core protein," but most recently
has simply been called "serglycin." Thus, the gene encoding this peptide
is the serglycin gene.

Serglycin proteoglycans (serglycin with attached
glycosaminoglycans) are stored inside cells as a macromolecular complex
bound to basically charged proteins. Because these proteoglycans are
bound by ionic linkage in the secretory granules of mouse and rat mast
cells to positively charged endopeptidases and exopeptidases that are
enzymatically active at neutral pH, it has been assumed that the serglycin
proteoglycans prevent intragranular autolysis of the proteases. The
proteoglycan-protease macromolecular complexes remain intact when they
are exocytosed from activated mast cells (Schwartz et al., J. Immunol.
126:2071-2078 (1981); Serafin et al., J. Biol. Chem. 261:15017-15021
(1986); Serafin et al., J. Immunol. 139:3771-3776 (1987); Le Trong et al.,
Proc. Natl. Acad. Sci. USA 84:364-367 (1987)), presumably attenuating
diffusion of the proteases from inflammatory sites and facilitating
concerted proteolysis of protein substrates.

cDNAs that encode serglycin have been isolated from rat (Bourdon
et al., Proc. Natl. Acad. Sci. USA 82:1321-1325 (1985); Bourdon et al., J.
Biol. Chem. 261:12534-12537 (1986); Avraham et al., J. Biol. Chem.
263:7292-7296 (1988)), mouse (Avraham et al., Proc. Natl. Acad. Sci. USA
86:3763-3767 (1989)), and human (Stevens et al., J. Biol. Chem. 263:7287-
7291 (1988); Alliel et al., FEBS Lett. 236:123-126 (1988); Stellrecht et al.,
Nucleic Acids Res. 17:7523 (1989)) cDNA libraries. These cDNAs encode
1.0-, 1.0-, and 1.3-kb transcripts in the mouse, rat, and human,
respectively. The mouse serglycin gene resides on chromosome 10, is
approximately 15 kb in size, and consists of three exons (Avraham et al.,
Proc. Natl. Acad. Sci. USA 86:3763-3767 (1989)).

Bourdon and coworkers (Bourdon et al., Proc. Natl. Acad. Sci. USA
82:1321 (1985); Bourdon et al., J. Biol. Chem. 261:12534 (1986)) isolated
and characterized a cDNA from a rat yolk sac tumor cell that encoded an
unusual 18.6 kDa proteoglycan peptide core with a 49 amino acid
glycosaminoglycan attachment region of alternating serine and glycine.
Because of the preponderance of these two amino acids, it was proposed
that the peptide core of this proteoglycan (designated serglycin) was
related to the peptide core of rat mast cell-derived heparin proteoglycan.
Numerous molecular biology studies have been carried out on the cDNAs
and genes that encode mouse, rat, and human serglycin. Using a 3' gene-
specific fragment of a rat serglycin cDNA (Avraham et al., J. Biol. Chem.
263:7292 (1988)), it was demonstrated that this gene is expressed at
relatively high levels in a variety of mouse and rat mast cells irrespective
of what type of glycosaminoglycan is polymerized onto the peptide core
(Tantravahi et al., Proc. Natl. Acad. Sci. USA 83:9207 (1986)). This gene
is also expressed in many other hematopoietic cells that possess secretory
granules (Tantravahi et al., Proc. Natl. Acad. Sci. USA 83:9207 (1986);
Stevens et al., J. Immunol. 139:863 (1987); Stevens et al., J. Biol. Chem.
263:7287 (1988); Rothenberg, M.E., Pomerantz, J.L., Owen, W.F.,
Avraham, S., Soberman et al., J. Biol. Chem. 263:13901 (1988); Stellrecht
et al., Nucleic Acids Res. 17:7523 (1989); Perin et al., Biochem. J.
255:1007-1013 (1988); MacDermott et al., J. Exp. Med. 162:1771 (1985);
Nicodemus et al., J. Biol. Chem. 265:5889 (1990)) and it appears that the
same peptide core is used in all of these cell types. The selection of the
type of glycosaminoglycan that will be synthesized onto this peptide core
therefore appears to be a cell-specific event that is not exclusively
dependent on the translated peptide core.

Although serglycin is specifically expressed in hematopoietic cells,
no tissue specific hematopoietic cell transcriptional regulatory elements 
have yet been identified. A need exists for such elements as they would 
allow, for the first time, the regulated induction or expression of 
recombinant genes in hematopoietic cells, especially in hematopoietic cell 
culture systems. 



SUMMARY OF THE INVENTION

Recognizing the importance of understanding tissue specific gene
expression in hematopoietic cells for the expression of recombinant genes
in cells of hematopoietic cell lineage, and cognizant of the need for DNA
regulatory elements or motifs capable of specifically stimulating or
inhibiting transcription for the controlled expression of genes in such cells,
the inventors investigated the 5' flanking region of the serglycin gene in
an attempt to identify such motifs. These studies have culminated in the
identification of three motifs in the 5' flanking region of the mouse
serglycin gene that regulate the constitutive transcription of that gene.

According to the invention, there is first provided, in isolated form,
a genetic sequence of approximately the proximal 500 nucleotides of the
5' flanking region of human and mouse serglycin gene, such flanking
region providing transcriptional regulatory elements sufficient to direct
expression of operably linked recombinant genes in hematopoietic host
cells in a constitutive manner.

The invention further provides, in isolated form, genetic sequences
encoding a positive transcriptional regulatory element, herein termed an
enhancer element, such element corresponding to nucleotides -118
through -81 of the mouse serglycin gene (5'-
ID No. 16]), and an equivalent position of the human serglycin gene, and
such element being dominantly active to stimulate transcription of
operably linked genes in hematopoietic host cells.

The invention further provides, in isolated form, genetic sequences
encoding a unique and atypical eukaryotic promoter element, such
promoter element corresponding to nucleotides -40 through -20 of the
mouse serglycin gene (5'-GAACCTCTITCn7WUW^GGGA<>3 [SEQ ID
No. 17]), and an equivalent position of the human serglycin gene, and such
element being dominantly active for the promotion of transcription in
operably linked genes in hematopoietic host cells.

The invention further provides, in isolated form, genetic sequences
encoding a negative transcriptional regulatory element, herein termed a
suppressor element, such element corresponding to nucleotides -250
through -190 of the mouse serglycin gene (5'-
TGCAAATGACAGATGGCAGA GCnTITGGAAAAAGAAAAAA
TAATAACCACACAGCAAACG-3' [SEQ ID No. 18]), and an equivalent
position of the human serglycin gene, and such element being dominantly
active to inhibit transcription of operably linked genes in fibroblast host
cells.
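
The coordinates recited above for the three elements can be cross-checked against the deletion constructs named in the legend to Figure 8. The following Python sketch is illustrative only: the element boundaries and construct endpoints are taken from the text, whereas the `Region` class, the `elements_in` helper, and the convention that numbering skips position 0 are assumptions introduced for the example. Under that convention the -40/+24 fragment is 64 residues long, which agrees with the 64 bp probe described for Figure 12.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A stretch of the 5' flanking region, numbered relative to the
    transcription-initiation site (+1). Hypothetical helper type."""
    name: str
    start: int  # 5'-most residue, e.g. -250
    end: int    # 3'-most residue, e.g. -190

    def length(self) -> int:
        # Assumption: there is no residue numbered 0, so a region that
        # crosses the -1/+1 boundary is one shorter than end - start + 1.
        n = self.end - self.start + 1
        return n - 1 if self.start < 0 < self.end else n

    def contains(self, other: "Region") -> bool:
        return self.start <= other.start and other.end <= self.end


# Element boundaries as recited in the text.
ELEMENTS = [
    Region("suppressor", -250, -190),
    Region("enhancer", -118, -81),
    Region("promoter", -40, -20),
]

# Deletion constructs named in the Figure 8 legend.
CONSTRUCTS = [
    Region("pPG(-504/+24)hGH", -504, 24),
    Region("pPG(-118/+24)hGH", -118, 24),
    Region("pPG(-40/+24)hGH", -40, 24),
]


def elements_in(construct: Region) -> list:
    """Names of the elements lying entirely within a construct."""
    return [e.name for e in ELEMENTS if construct.contains(e)]


if __name__ == "__main__":
    for e in ELEMENTS:
        print(f"{e.name}: {e.start}..{e.end} ({e.length()} nt)")
    for c in CONSTRUCTS:
        print(f"{c.name}: {', '.join(elements_in(c))}")
```

Read this way, only the longest construct retains the suppressor, and the shortest retains the promoter alone, which is the logic a progressive 5' deletion series relies on.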

The invention further provides expression vectors containing such
genetic sequences, such expression vectors providing such genetic
sequences in a manner that permits a gene of interest to be operably
linked to the regulatory element encoded by the genetic sequence such
that transcriptional expression of the gene of interest is under the control
of the genetic sequence of the invention.

The invention further provides expression vectors containing such
genetic sequences operably linked to a gene of interest.

The invention further provides host cells transformed with such
expression vectors.

The invention further provides methods for the production of a
peptide of interest, or for inhibiting the production of a peptide of
interest, using the genetically engineered genetic sequences, vectors and
hosts of the invention.

The invention further provides methods for the inhibition of the
expression of a gene of interest, using the genetically engineered genetic
sequences of the invention to direct the transcription of an anti-sense
RNA complementary to the gene of interest.

The invention further provides cell-free preparations of
B/F(-250/-161)-I, a trans-acting factor extractable from the nuclei of rat
basophilic leukemia-1 cells and rat-1 fibroblasts, such factor specifically
binding to the suppressor element of the invention.

The invention further provides cell-free preparations of
B/F(-250/-161)-II, a trans-acting factor extractable from the nuclei of
rat-1 fibroblasts, such factor specifically binding to the suppressor
element of the invention.

The invention further provides cell-free preparations of B(-118/-81)-I,
a trans-acting factor extractable from the nuclei of rat basophilic
leukemia-1 cells, such factor specifically binding to the enhancer element
of the invention.

The invention further provides cell-free preparations of F(-118/-81)-I,
a trans-acting factor extractable from the nuclei of rat-1 fibroblast cells,
such factor specifically binding to the enhancer element of the invention.

The invention further provides cell-free preparations of
B/F(-40/+24)-I, a trans-acting factor extractable from the nuclei of rat
basophilic leukemia-1 cells and rat-1 fibroblast cells, and to
F(-40/+24)-II and B(-40/+24)-II, trans-acting factors extractable from the
nuclei of rat-1 fibroblast cells or rat basophilic leukemia-1 cells,
respectively, such factors specifically binding to the enhancer element of
the invention.

BRIEF DESCRIPTION OF THE FIGURES

Figure 1. Restriction map and nucleotide sequencing strategy of
cDNA-H4 and its related four cDNAs. The cDNA-H4 originated from
HL-60 cells (a promyelocytic leukemia cell line). "X" and "A" refer to the
sites within each cDNA which are susceptible to XmnI and AccI,
respectively. The arrows indicate the direction and length of each
subcloned fragment of cDNA that was sequenced.

Figure 2. Consensus nucleotide sequence of the HL-60 cell-derived
cDNAs and the predicted amino acid sequence of the translated
proteoglycan peptide core (serglycin) [SEQ ID Nos. 9 and 10]. The arrow
indicates the putative site of cleavage of the signal peptide. Stop codons
are indicated by ***. The numbers on the right and left indicate the
amino acid and the nucleotide in the respective sequence. The XmnI and
AccI restriction sites are indicated. The 5' end of the cDNA-H12 was 4
bp longer, and the 5' end of cDNA-H19 was 14 bp shorter than cDNA-
H4. cDNA-HS differed from the cDNA-H4 in that it had an extra
thymidine (shown in parentheses) at the 3' end of its cDNA.

Figure 3. Restriction map of the human serglycin gene isolated
from a human leukocyte genomic DNA library (Klickstein, L.B. et al., J.
Exp. Med. 165:1095-1112 (1987)).

Figure 4. A. Nucleotide sequence of the human serglycin gene
[SEQ ID Nos. 11 (nucleotide sequence) and 12 (protein sequence)]. The
nucleotide sequences of the 5' flanking region, the exon/intron junctions,
and the three exons are depicted. The hydrophobic signal peptide of the
translated proteoglycan peptide core in exon 1 and the serine-glycine rich
glycosaminoglycan attachment region in exon 3 are boxed. The
polyadenylation site in exon 3 is underlined.

B. The complete nucleotide sequence of the human
serglycin gene, including introns [SEQ ID No. 15].

Figure 5. Nucleotide sequence of the mouse serglycin gene [SEQ
ID Nos. 13 (nucleotide sequence) and 14 (protein sequence)], Avraham,
S. et al., J. Biol. Chem. 264:16719-16726 (1989). The nucleotide sequence
of the 5' flanking region, the exon/intron junctions, and the three exons
are depicted. The arrow indicates the probable transcription-initiation
site. The hydrophobic signal peptide of the translated proteoglycan
peptide core is boxed in exon 1. The di-acidic amino acid sequence that
has been proposed to dictate glycosaminoglycan addition to proteins and
the serine-glycine rich, glycosaminoglycan attachment region are boxed in
exon 3. The polyadenylation site in exon 3 is underlined.

Figure 6. ELISA of the rabbit anti-peptide 02 serum. Peptides 01
(○-○) and 02 (●-●) were coupled to separate microtiter wells and
different dilutions of the rabbit anti-peptide 02 sera were examined for
their reactivity against the specific peptide as detected
spectrophotometrically at 490 nm after the addition of horseradish
peroxidase-conjugated goat anti-rabbit antibody followed by
2,2'-azino-di-[3-ethylbenzthiazoline] sulfonate. The amino acid sequence
of peptide 01 is
Ser-Val-Gln-Gly-Tyr-Pro-Thr-Gln-Arg-Ala-Arg-Tyr-Gln-Trp-Val-Arg. The
amino acid sequence of peptide 02 is
Ser-Asn-Lys-Ile-Pro-Arg-Leu-Arg-Thr-Asp-Leu-Phe-Pro-Lys-Thr-Arg.

Figure 7. SDS-PAGE analysis of immunoprecipitates of lysates of
[35S]sulfate-labeled (A) and [35S]methionine-labeled (B) HL-60 cells. (A)
Lysates of [35S]sulfate-labeled HL-60 cells were analyzed before (lane 1)
and after immunoprecipitation with anti-peptide 02 IgG in the presence
(lane 2) or absence (lane 3) of peptide 02. (B) Lysates of HL-60 cells
were analyzed after a 2 min (lane 1) or a 10 min (lane 2) incubation with
[35S]methionine by immunoprecipitation with anti-peptide 02 IgG. Ten
min [35S]methionine-labeled HL-60 cells were washed and then incubated
for an additional 5 min in methionine-containing enriched medium before
lysates were immunoprecipitated with anti-peptide 02 IgG (lane 3).
Lysates of 5 min labeled HL-60 cells were immunoprecipitated with
anti-peptide 02 IgG in the presence of peptide 01 (lane 4) or peptide 02
(lane 5). The [35S]methionine-labeled proteins that were nonspecifically
immunoprecipitated with pre-immune IgG are depicted in lane 6. The
origin (ori) and the markers are indicated on the far left and right of
each panel. The arrows indicate the precursor peptide core and the
mature proteoglycan that are immunoprecipitated with anti-peptide 02
IgG.

Figure 8. A. Effects of progressive deletion of the 5' flanking
region of the mouse serglycin gene on its ability to direct human growth
hormone (hGH) expression in transfected cells. The solid, bold horizontal
lines (■■■■■) represent the various lengths of the 5' flanking region of
the mouse serglycin gene ligated to pØGH, a plasmid that contains a
promoterless hGH gene (□□□□□). The negative and positive
numbers in the various constructs refer to the length of the nucleotide
sequence that extends upstream and downstream, respectively, of the
transcription-initiation site of the gene. In each experiment, the amount
of hGH was quantitated 4 d after transfection of rat basophilic leukemia-1
(RBL-1) cells and rat-1 fibroblasts with the specific plasmid construct.
The numbers on the right are the hGH values obtained relative to the
same population of cells transfected with the control plasmid construct,
pXGH5. The indicated hGH activities represent the mean ± SD of data
from 5 to 18 experiments. ND, not determined.

B. Blot analysis of hGH mRNA in rat basophilic leukemia-1
cells and rat-1 fibroblasts transfected with different DNA constructs.
RNA blots containing approximately equal amounts of total RNA (10
μg/sample) from rat basophilic leukemia-1 cells and rat-1 fibroblasts
transfected with pPG(-504/+24)hGH (lane 1), pPG(-118/+24)hGH (lane
2), pPG(-40/+24)hGH (lane 3), pØGH (lane 4), pSV40-hGH (lane 5), or
pXGH5 (lane 6) were probed with a 32P-labeled hGH cDNA. The arrows
on the right indicate the migration positions of 2.0 kb rRNA, hGH
mRNA, and β-actin mRNA. pSV40-hGH was obtained by Dr. J. Sand,
Brigham and Women's Hospital and Harvard Medical School, Boston,
MA.
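
The normalization described in the Figure 8A legend (hGH from a test construct expressed relative to the control plasmid in the same cell population, then summarized as mean ± SD across experiments) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the input numbers are invented placeholders rather than data from the figure, and the use of a per-experiment ratio with the sample standard deviation is an assumption, since the legend does not specify the exact calculation.

```python
from statistics import mean, stdev


def relative_expression(test_values, control_values):
    """Ratio of hGH from a test construct to hGH from the control
    plasmid, computed experiment by experiment (assumed pairing)."""
    if len(test_values) != len(control_values):
        raise ValueError("each test value needs a paired control value")
    return [t / c for t, c in zip(test_values, control_values)]


def summarize(ratios):
    """Mean and sample standard deviation of the relative values."""
    return mean(ratios), stdev(ratios)


if __name__ == "__main__":
    # Invented placeholder measurements (arbitrary units), not real data.
    test = [4.0, 6.0, 8.0]
    control = [2.0, 2.0, 2.0]
    ratios = relative_expression(test, control)
    m, sd = summarize(ratios)
    print(f"relative hGH = {m:.2f} ± {sd:.2f} (n = {len(ratios)})")
```

Pairing each test value with the control from the same experiment removes day-to-day variation in transfection efficiency from the comparison, which is presumably why the legend reports values relative to the control construct.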

Figure 9. Effect of two 5' flanking regions of the mouse serglycin
gene on its ability to enhance and suppress hGH production in cells
transfected with plasmid constructs that contain an enhancerless SV40
early promoter. The hatched lines (□□□□□) and the round-dot lines
(○ ○ ○ ○) represent the structural sequences of the hGH gene and SV40
early promoter, respectively, within the plasmid construct. The solid, bold
horizontal lines (■■■■■) represent the specific parts of the 5' flanking
region of the mouse serglycin gene that are inserted upstream of the SV40
promoter in pSV40-hGH. The numbers on the right are the hGH values
obtained at 4 d relative to those cells transfected with the control plasmid,
pSV40-hGH. The indicated hGH activities represent the mean ± SD
values of data from 5 to 6 experiments of 4-d duration, with each
experiment performed on 2-3 replicate dishes of cells.

Figure 10. Detection of trans-acting factors in the nucleus of rat
basophilic leukemia-1 cells (RBL-1) and rat-1 fibroblasts (Rat-1 Fib.) that
bind cis-acting elements in the putative suppressor region of the 5'
flanking region of the mouse serglycin gene. Gel mobility shift assays
were performed with the diagrammatically depicted nucleotide sequence
in the 5' flanking region of the mouse serglycin gene. In lane 1, 1 ng of
the 32P-labeled DNA fragment (residues -250 to -161) was
electrophoresed in the gel in the absence of nuclear extracts. In lanes 2
to 4 and lanes 5 to 7, the probe was incubated before electrophoresis with
nuclear extracts from rat basophilic leukemia-1 cells and rat-1 fibroblasts,
respectively. Competition assays were performed using 5 ng of the same
nonradioactive DNA probe (lanes 3 and 6) or 100 ng of sonicated salmon
sperm DNA (lanes 4 and 7). The probe and the trans-acting factors
present in fibroblasts (F(-250/-161)-II and B/F(-250/-161)-I) and rat
basophilic leukemia-1 cells (B/F(-250/-161)-I) are indicated on the right.

Figure 11. Detection of trans-acting factors in the nucleus of rat
basophilic leukemia-1 cells and rat-1 fibroblasts that bind cis-acting
elements in the putative enhancer region of the 5' flanking region of the
mouse serglycin gene. Gel mobility shift assays were performed with the
diagrammatically depicted nucleotide sequence in the 5' flanking region
of the mouse serglycin gene. In lane 1, 1 ng of the 32P-labeled DNA
fragment (residues -118 to -81) was electrophoresed in the gel in the
absence of nuclear extracts. In lanes 2 to 4 and lanes 5 to 7, the probe
was incubated before electrophoresis with nuclear extracts from rat
basophilic leukemia-1 cells and rat-1 fibroblasts, respectively. Competition
assays were performed using 5 ng of the same nonradioactive DNA probe
(lanes 3 and 6) or 100 ng of sonicated salmon sperm DNA (lanes 4 and
7). The probe, nonspecific (ns) bound probe, and the trans-acting factors
present in fibroblasts (F(-118/-81)-I) and basophilic leukemia-1 cells
(B(-118/-81)-I) are indicated on the right.

Figure 12. Role of residues -28, -30, and -38 in the proximal
promoter region of the mouse serglycin gene in its interaction with
trans-acting factors in the nuclei of rat basophilic leukemia-1 cells and
rat-1 fibroblasts. Gel mobility shift assays were performed with the
diagrammatically depicted 64 bp nucleotide sequence in the 5' flanking
region of the mouse serglycin gene (residues -40 to +24) prepared with
and without point mutations. In lane 1, 1 ng of the 32P-labeled
oligonucleotide was electrophoresed in the gel in the absence of nuclear
extracts. In lanes 2 to 6 and lanes 7 to 11, the probe was incubated
before electrophoresis with nuclear extracts from rat basophilic leukemia-1
cells and rat-1 fibroblasts, respectively. Competition assays were
performed with 5 ng of nonradioactive DNA that corresponded to the
probe (lanes 3 and 8) or 50 ng of nonradioactive DNA that had a mutated
residue -30 (lanes 4 and 9), -28 (lanes 5 and 10), or -38 (lanes 6 and 11).
The probe and the retarded trans-acting factors present in fibroblasts
(F(-40/+24)-II and B/F(-40/+24)-I) and basophilic leukemia-1 cells
(B(-40/+24)-II and B/F(-40/+24)-I) are indicated on the right.

Figure 13. Location of the Alu Elements and the HpaII/MspI Sites
in the Human Serglycin Gene. Figure 13A: the locations of the 21 Alu
elements in the 5'-flanking region, intron 1, and intron 2 of the serglycin
gene are depicted. The locations of the two Alu elements containing only
the left arm are identified with 1/2. The three exons are boxed (■).
Figure 13B: the locations of the HpaII/MspI sites (5'-CCGG-3') in the
serglycin gene are indicated by the vertical lines. The letters depict the
location of the probes used to determine the extent of methylation of
these sites. Sites in the serglycin gene in HL-60 cells that are at least
partially methylated are indicated by closed circles, and nonmethylated
sites are indicated by open circles.

DETAILED DESCRIPTION OF THE PREFERRED
EMBODIMENTS

I. Definitions 

In the description that follows, a number of terms used in
recombinant DNA (rDNA) technology are extensively utilized. In order
to provide a clear and consistent understanding of the specification and
claims, including the scope to be given such terms, the following
definitions are provided.

Gene. A DNA sequence containing a template for an RNA
polymerase. The RNA transcribed from a gene may or may not code for
a protein. RNA that codes for a protein is termed messenger RNA
(mRNA) and, in eukaryotes, is transcribed by RNA polymerase II.
However, it is also known to construct a gene containing an RNA
polymerase II template wherein an RNA sequence is transcribed which has
a sequence complementary to that of a specific mRNA but is not normally
translated. Such a gene construct is herein termed an "antisense RNA
gene" and such an RNA transcript is termed an "antisense RNA."
Antisense RNAs are not normally translatable due to the presence of
translational stop codons in the antisense RNA sequence.

A "complementary DNA" or "cDNA" gene includes recombinant 
genes synthesized by reverse transcription of mRNA and from which 
intervening sequences (introns) have been removed. 

Cloning vehicle. A plasmid or phage DNA or other DNA
sequence which is able to replicate autonomously in a host cell, and that
is characterized by one or a small number of endonuclease recognition
sites at which such DNA sequences may be cut in a determinable fashion
without loss of an essential biological function of the vehicle, and into
which DNA may be spliced in order to bring about its replication and
cloning. The cloning vehicle may further contain a marker suitable for
use in the identification of cells transformed with the cloning vehicle.
Markers, for example, are tetracycline resistance or ampicillin resistance.
The word "vector" is sometimes used for "cloning vehicle."

Expression vehicle. A vehicle or vector similar to a cloning vehicle
but which is capable of expressing a gene that has been cloned into it,
after transformation into a host. According to the invention, the cloned
gene or coding sequence (the gene of interest) is usually placed under the
control of (i.e., operably linked to) certain control sequences such as the
promoter sequences and/or regulatory elements of the invention.
Expression control sequences will vary and may additionally contain
transcriptional elements such as enhancer elements, termination
sequences, tissue-specificity elements in addition to those of the invention,
and/or translational initiation and termination sites.

Proteoglycan. This term as used throughout the specification and 
claims means mammalian and especially human "hematopoietic cell 
proteoglycan" that contains glycosaminoglycan chains covalently bound to 
the proteoglycan's core protein. 

Serglycin. Serglycin is the peptide core of hematopoietic cell
secretory granule proteoglycan. The term is meant to include peptide
fragments of hematopoietic cell secretory granule proteoglycan wherein
the peptide core protein contains less than the naturally-occurring number
of amino acids, but which retains biological (functional or structural)
activity. An example of a functional activity of serglycin is the ability to
induce a specific biological response in the same manner that the native
non-recombinant protein does, such as the ability to be conjugated into
a specific proteoglycan form. An example of a structural activity is the
ability to bind antibodies which also recognize the native non-recombinant
protein.

The term is also used to include serglycin fusion proteins, that is,
a peptide which comprises the sequence of a naturally-occurring serglycin
or a biologically active fragment thereof together with one or more
additional flanking amino acids, but which still possesses hematopoietic
cell secretory granule proteoglycan biological (functional or structural)
activity.

Transcriptional regulatory element. A transcriptional regulatory
element (or DNA regulatory element) is a DNA sequence that, when
operably linked to a gene of interest, is capable of altering the
transcription of such gene of interest in a specific way characteristic of
such element. Transcriptional regulatory elements include promoters,
enhancers, suppressors, transcriptional start sites, transcriptional stop sites,
polyadenylation sites, and the like.

Functional Derivative. A "functional derivative" of the DNA
regulatory elements of the invention is a DNA sequence that possesses at
least a biologically active fragment of the sequence of the regulatory
elements of the invention; by "biologically active" fragment is meant that
the fragment retains a biological activity (either functional or structural)
that is substantially similar to a biological activity of the full-length DNA
element. A biological activity of a DNA regulatory element of the
invention is its ability to alter transcription in a manner known to be
attributed to the full-length element. Hence, biologically active fragments
of the suppressor element of the invention will retain the ability to inhibit
or repress transcription; biologically active fragments of the enhancer
element of the invention will retain the ability to stimulate transcription;
and biologically active fragments of the promoter sequence of the
invention will retain the ability to promote transcription.

The term "functional derivative" is intended to include the
"fragments," "variants," "analogues," or "chemical derivatives" of a molecule.

Fragment. A "fragment" of a nucleotide or peptide sequence is
meant to refer to a sequence that is less than that believed to be the
"full-length" sequence.

Variant. A "variant" of a molecule is meant to refer to allelic
variations of such sequences, that is, a sequence substantially similar in
structure and biological activity to either the entire molecule, or to a
fragment thereof.

II. Genetic Engineering of the Regulatory Elements of the Invention

Provided herein are transcriptional cis-acting elements of
hematopoietic cells: an enhancer element, a suppressor element and a
novel promoter element. The transcriptional cis-acting elements of the
invention are naturally found in the 5' regulatory region of the serglycin
gene. In addition, provided herein are trans-acting factors, such factors
specifically binding to the cis-acting elements of the invention. The
process for genetically engineering the genetic regulatory elements of the
invention, or the trans-acting factors of the invention, is facilitated through
the cloning of genetic sequences that contain the sequence for such
regulatory elements or factors. The 333 base pair (bp) nucleotide
sequence 5' of the transcription-initiation site of the mouse gene is nearly
identical to the corresponding region of the human gene (Nicodemus
et al., J. Biol. Chem. 265:5889-5896 (1990)). Thus, the mouse or human
sequence may be used interchangeably for the cloning of the alternate
species, or the cloning of similar sequences in other species.

The regulatory elements may be cloned directly, using, for example,
promoter probe vectors and the like. For the identification of those
regions of the serglycin gene that provide the cis-acting enhancer,
suppressor, and/or promoter, rat basophilic leukemia-1 cells, mouse
WEHI-3 cells, rat-1 fibroblasts, and mouse 3T3 fibroblasts may be
transiently transfected with plasmid constructs containing various lengths
of the 504-bp 5' flanking region of the mouse serglycin gene linked to a
gene of interest, for example, the human growth hormone (hGH) gene,
so as to provide a reporter, expression of which indicates transcriptional
activity. Rat basophilic leukemia-1 cells and mouse WEHI-3 cells are
preferred because they contain cytoplasmic granules and express large
amounts of serglycin, whereas no serglycin transcript is present in either
fibroblast line (Tantravahi et al., Proc. Natl. Acad. Sci. USA 83:9207-9210
(1986)).

The hGH transient expression system is preferred because it is at
least 10-fold more sensitive than the CAT system (Selden et al., Mol. Cell.
Biol. 6:3173-3179 (1986)) or other systems that are based on the
expression of β-galactosidase (An et al., Mol. Cell. Biol. 2:1628-1632
(1982)) and xanthine-guanine phosphoribosyl transferase (Chu et al.,
Nucleic Acids Res. 13:2921-2930 (1985)). This increased sensitivity enables
hGH levels to be measured after transfection with a very small amount of
plasmid, thus avoiding potential problems of competition (Selden et al.,
Mol. Cell. Biol. 6:3173-3179 (1986)). The hGH transient expression system
is also well-suited for use because plasmids are known in the art (such as
pXGH5, for example) that can be used as an internal positive control for
normalizing the efficiency of transfection, thereby facilitating the
interpretation of data from separate experiments.

The specificity of the transcriptional regulatory elements of the
invention is such that reporter mRNA expression using the transcriptional
regulatory elements of the invention may be detected in rat-1 fibroblasts
transfected with appropriate promoter probe plasmids that contain the
desired regulatory elements operably linked to the reporter gene. For
example, construct pPG(-118/+24)hGH, containing the promoter and
enhancer element of the invention, will provide high levels of reporter
expression in such host cells.

In addition, high levels of reporter expression will be detected in
rat basophilic leukemia-1 cells transfected with constructs containing the
suppressor, enhancer, and promoter of the invention (for example,
pPG(-504/+24)hGH) or just the enhancer and promoter of the invention
(for example, pPG(-118/+24)hGH).

In contrast, lesser amounts of reporter mRNA will be detected in rat
basophilic leukemia-1 cells and rat-1 fibroblasts transfected with promoter
probe vectors containing only the promoter of the invention (for example,
pPG(-40/+24)hGH). No reporter mRNA will be detected in rat
basophilic leukemia-1 cells transfected with pØGH or in rat-1 fibroblasts
transfected with constructs containing the suppressor, enhancer and
promoter of the invention (for example, pPG(-504/+24)hGH) or pØGH.
Because large amounts of hGH are detected in the culture media of rat
basophilic leukemia-1 cells and rat-1 fibroblasts that contain abundant
levels of hGH mRNA and because lesser amounts of hGH are detected
in the culture media of cells containing intermediate levels of hGH
mRNA, transcription and translation of the hGH gene are related in both
transfected cell types.

The results of the transfections should be normalized to that
obtained with a reference plasmid, such as, for the growth hormone
reporter, pXGH5. Rat basophilic leukemia-1 cells produce more reporter
(18-fold more hGH) than transfected rat-1 fibroblasts. Likewise, mouse
WEHI-3 cells produce more reporter than transfected mouse 3T3
fibroblasts (pPG(-504/+24)hGH produced 20-fold more hGH in mouse
WEHI-3 cells than in transfected mouse 3T3 fibroblasts).
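
The normalization described above can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the
function names are hypothetical, the readings are placeholder values in
arbitrary units (not data from this specification), and the reference
reading stands for the hGH level obtained with a reference plasmid such
as pXGH5 in the same cell type.

```python
def normalize_to_reference(raw_hgh, reference_hgh):
    """Express each construct's hGH reading as a fraction of the
    reading obtained with the reference plasmid in the same cells."""
    if reference_hgh <= 0:
        raise ValueError("reference reading must be positive")
    return {name: value / reference_hgh for name, value in raw_hgh.items()}


def fold_difference(normalized_a, normalized_b, construct):
    """Ratio of normalized reporter output for one construct in
    cell type A relative to cell type B."""
    return normalized_a[construct] / normalized_b[construct]


# Placeholder readings (arbitrary units); NOT measurements from the text.
hematopoietic_raw = {"pPG(-504/+24)hGH": 80.0, "pPG(-40/+24)hGH": 10.0}
fibroblast_raw = {"pPG(-504/+24)hGH": 8.0, "pPG(-40/+24)hGH": 6.0}

# Each cell type is normalized to its own reference-plasmid reading.
hematopoietic_norm = normalize_to_reference(hematopoietic_raw, reference_hgh=100.0)
fibroblast_norm = normalize_to_reference(fibroblast_raw, reference_hgh=40.0)

ratio = fold_difference(hematopoietic_norm, fibroblast_norm, "pPG(-504/+24)hGH")
print(f"fold difference for pPG(-504/+24)hGH: {ratio:.1f}")
```

Dividing by the reference reading within each cell type before comparing
cell types removes differences in transfection efficiency from the
comparison, which is the purpose of the internal control.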

Based on results such as those discussed above and herein, the
presence of cis-acting regulatory elements in the 5' flanking region of a gene,
and/or in the first intron of such gene, and especially of the serglycin
gene, may be established. Such serglycin gene elements preferentially
enhance the constitutive transcription of a gene of interest in
hematopoietic cells or preferentially suppress transcription of a gene of
interest in fibroblasts, although small differences may be present due to
other factors.

The sequences of intron 1 of the serglycin gene may act in concert
to regulate transcription of a gene that is operably linked to the serglycin
gene promoter in different cell types. When it is desired to utilize this
concerted action, intron 1 sequences of the serglycin gene may be inserted
into the coding sequence of a gene of interest such that what becomes
exon 1 has approximately the same size as exon 1 of serglycin, and in a
manner such that the reading frame of the coding sequence is not altered,
and the normal recognition sequences at the flanking regions of the intron
are provided, so as to allow subsequent excision of the intron.

To locate the cis-acting elements of the invention more precisely,
additional plasmid constructs may be prepared that contain progressively
less of the 5' flanking region of the serglycin gene. For example, the
transfection of rat basophilic leukemia-1 cells and rat-1 fibroblasts with
these shortened constructs of the 5' flanking region of the mouse
serglycin gene reveals that a cis-acting element resides between residues
-250 and -190 and suppresses transcription of this gene, and that this
suppressor element is more dominantly active in rat-1 fibroblasts than in
rat basophilic leukemia-1 cells. In a similar manner, such experiments
revealed that an enhancer element resides between residues -118 and -81
of the mouse serglycin gene that not only appears to be important for the
positive constitutive transcription of this gene but also is dominantly active
in rat basophilic leukemia-1 cells.
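
The logic of such a deletion series can be sketched as follows. This is
a minimal illustration, not the procedure of the invention: the function
name, the two-fold threshold, and the activity values are assumptions
chosen for the example, and only the 5' endpoints are taken from the
residue boundaries named in the text. An interval is called a
"suppressor" when removing it raises reporter output, and a "positive"
element (enhancer or promoter) when removing it lowers reporter output.

```python
def map_elements(constructs, min_fold=2.0):
    """Compare each deletion construct with the next shorter one.

    constructs: iterable of (five_prime_end, normalized_activity) pairs,
                where five_prime_end is a negative residue number.
    Returns a list of (upstream_end, downstream_end, kind) tuples for
    every interval whose removal changes activity at least min_fold.
    """
    ordered = sorted(constructs)  # longest construct (most negative end) first
    calls = []
    for (long_end, long_act), (short_end, short_act) in zip(ordered, ordered[1:]):
        if long_act == 0 and short_act == 0:
            continue  # nothing measurable on either side of the interval
        if short_act >= min_fold * long_act:
            # activity rose when the interval was deleted
            calls.append((long_end, short_end, "suppressor"))
        elif long_act >= min_fold * short_act:
            # activity fell when the interval was deleted
            calls.append((long_end, short_end, "positive"))
    return calls


# Placeholder activities (arbitrary normalized units); NOT data from the text.
series = [
    (-504, 1.0), (-250, 1.0), (-190, 5.0), (-118, 5.0),
    (-81, 1.0), (-40, 1.0), (-20, 0.0),
]
calls = map_elements(series)
for upstream, downstream, kind in calls:
    print(f"{kind} element between residues {upstream} and {downstream}")
```

Running the same comparison separately on data from each cell type would
show whether an element is more active in one cell type than in the
other.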

Rat basophilic leukemia-1 cells and fibroblasts produce
substantially more reporter protein when transfected with a construct
containing the enhancer and promoter of the invention (for example,
pPG(-118M4)-SV40-hGH) than with a control that provides a foreign
element (for example, pSV40-hGH). Typical of other enhancers, the
enhancer activity of the enhancer of the invention is not diminished by
changing its orientation and its distance from the SV40 early promoter in
the plasmid.

The promoter element of the invention is a unique sequence that
provides an alternative to the classical TATA box. For most genes,
transcription is initiated ~30 bp downstream of the proximal end of the
promoter, which usually is a TATA box. Because no reporter protein is
detected when rat basophilic leukemia-1 cells and rat-1 fibroblasts are
transfected with constructs containing only 20 nucleotides of the proximal
end of the serglycin gene (pPG(-20/+24)hGH), but some reporter protein
is produced by cells transfected with constructs containing at least 40
nucleotides (pPG(-40/+24)hGH), the proximal element of the promoter
of the invention resides between residues -40 and -20. Inasmuch as no
TATA box is present in this region (Avraham et al., J. Biol. Chem.
264:16719-16726 (1989); Nicodemus et al., J. Biol. Chem. 265:5889-5896
(1990)), the TCTAAAA sequence at residues -31 to -25 may serve as an
alternative element.
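
The residue numbering used here (negative numbers counting upstream from
the transcription-initiation site, with no residue 0) can be made
concrete with a short Python sketch. The function name is hypothetical
and the flanking sequence below is a synthetic placeholder built only to
exercise the coordinate arithmetic; it is not the serglycin sequence.

```python
def find_motif_positions(flank, motif):
    """Locate every occurrence of motif in a 5' flanking sequence.

    flank: sequence written 5'->3' whose LAST base is residue -1
           (the base immediately upstream of the initiation site).
    Returns a list of (first_residue, last_residue) pairs in the
    negative coordinate system.
    """
    flank = flank.upper()
    motif = motif.upper()
    n = len(flank)
    hits = []
    start = flank.find(motif)
    while start != -1:
        first = start - n               # string index 0 is residue -n
        last = first + len(motif) - 1   # inclusive end of the motif
        hits.append((first, last))
        start = flank.find(motif, start + 1)
    return hits


# Synthetic 40-nt placeholder: filler bases around the motif, NOT real sequence.
placeholder_flank = "G" * 9 + "TCTAAAA" + "G" * 24
hits = find_motif_positions(placeholder_flank, "TCTAAAA")
print(hits)
```

With the motif placed 24 bases from the 3' end of a 40-nt flank, the
seven-residue match spans residues -31 to -25, which is how a motif of
that length occupies the interval quoted in the text.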

To demonstrate the important residues of the elements of the
invention, they may be mutated, using techniques known in the art.
For example, to demonstrate the important residues of the promoter
element of the invention, residues -28, -30, or -38 were mutated. Based
on the relative amount of reporter produced in the transfected cells, it
was shown that the 5' flanking region containing the TCTAAAA
sequence functions as a TATA box equivalent.

As an alternative to using promoter probe vectors for the cloning of the
regulatory elements of the invention, the coding sequence of the serglycin
gene (previously called the secretory granule proteoglycan peptide core
protein) may be cloned and used to identify clones or DNA containing
the desired regulatory elements operably linked to the cloned coding
sequence. The discussion below, while it specifically refers to cloning of
the serglycin gene, may also be adapted by those of skill in the art for the
cloning of the trans-acting factors of the invention.

The regulatory elements of the genomic DNA of the invention may
be obtained in association with the 5' promoter region of the serglycin
gene. For example, when rat-1 fibroblasts were stably transfected with the
mouse genomic clone, λMG-PG1, two cell lines were obtained that
expressed low levels of the 1.0-kb serglycin mRNA (Avraham et al., J.
Biol. Chem. 264:16719-16726 (1989)). This finding indicated that
λMG-PG1 contained the entire mouse serglycin gene, including, perhaps,
some of the regulatory elements within its promoter region. S1 nuclease
mapping and primer extension analysis revealed that the primary
transcription-initiation site for this gene in mouse bone marrow-derived
mast cells (BMMC) resides ~40 nucleotides upstream of the
translation-initiation site (Avraham et al., J. Biol. Chem. 264:16719-16726
(1989)).

As used herein, the term "genetic sequences" is intended to refer
to a nucleic acid molecule (preferably DNA). Genetic sequences that are
capable of providing the regulatory elements of the invention are derived
from a variety of sources, including genomic DNA, synthetic DNA, and
combinations thereof. Genetic sequences that are capable of encoding
serglycin may further be derived from mRNA or cDNA.

Genomic DNA containing the serglycin gene can be extracted and
purified from any eukaryotic and especially mammalian cell that has this
gene in its genome by means well known in the art (for example, see
Guide to Molecular Cloning Techniques, S.L. Berger et al., eds., Academic
Press (1987)).

Serglycin mRNA can be isolated from any cell which produces or
expresses this protein. Serglycin mRNA can also be used to produce
cDNA by means well known in the art (for example, see Guide to
Molecular Cloning Techniques, S.L. Berger et al., eds., Academic Press
(1987)). Such cell sources include, but are not limited to, fresh cell
preparations and cultured cell lines, especially fresh or cultured connective
tissue mast cells, mucosal mast cells, basophils, natural killer cells,
cytotoxic T lymphocytes, eosinophils, neutrophils, macrophages and
platelets.

Preferably, the mRNA preparation used will be enriched in mRNA
coding for serglycin, either naturally, by isolation from cells which are
producing large amounts of the protein, or in vitro, by techniques
commonly used to enrich mRNA preparations for specific sequences, such
as sucrose gradient centrifugation, or both.

For cloning into a vector, suitable DNA preparations (genomic
DNA containing the regulatory elements of the invention or cDNA
encoding the serglycin) are randomly sheared or enzymatically cleaved,
respectively, and ligated into appropriate vectors to form a recombinant
gene (either genomic or cDNA) library.

A DNA sequence providing the regulatory elements of the
invention, or the serglycin coding sequence, may be inserted into a DNA
vector in accordance with conventional techniques, including blunt-ending
or staggered-ending termini for ligation, restriction enzyme digestion to
provide appropriate termini, filling in of cohesive ends as appropriate,
alkaline phosphatase treatment to avoid undesirable joining, and ligation
with appropriate ligases. Techniques for such manipulations are disclosed
by Maniatis, T., et al., supra, and are well known in the art.

A serglycin clone may be identified by any means which specifically
selects for serglycin DNA such as, for example, a) by hybridization with
an appropriate nucleic acid probe(s) containing a sequence specific for the
DNA of this protein, or b) by hybridization-selected translational analysis
in which native mRNA which hybridizes to the clone in question is
translated in vitro and the translation products are further characterized,
or, c) if the cloned genetic sequences are themselves capable of expressing
mRNA, by immunoprecipitation of a translated serglycin protein product
produced by the host containing the clone. The ability to specifically bind
antibody against serglycin or its proteoglycan, the ability to elicit the
production of antibody capable of binding to serglycin or its proteoglycan,
and/or the ability to provide a serglycin proteoglycan-associated function
to a recipient cell, are all examples of the biological properties of the
serglycin proteoglycan.

Oligonucleotide probes specific for the serglycin gene, useful for
the identification of genomic or cDNA clones to this protein or for
the identification of clones to the regulatory elements of the invention,
can be designed from knowledge of the amino acid sequence of the
protein's peptide core, or from knowledge of the nucleotide sequence of
the regulatory element, respectively.

When designing a probe against a peptide sequence, the sequence
of amino acid residues in a peptide is designated herein either through
the use of their commonly employed three-letter designations or by their
single-letter designations. A listing of these three-letter and one-letter
designations may be found in textbooks such as Biochemistry, Lehninger,
A., Worth Publishers, New York, NY (1970). When the amino acid
sequence is listed horizontally, the amino terminus is intended to be on
the left end whereas the carboxy terminus is intended to be at the right
end. The residues of amino acids in a peptide may be separated by
hyphens. Such hyphens are intended solely to facilitate the presentation
of a sequence.

When designing probes against a peptide sequence, because the
genetic code is degenerate, more than one codon may be used to encode
a particular amino acid (Watson, J.D., In: Molecular Biology of the Gene,
3rd Ed., W.A. Benjamin, Inc., Menlo Park, CA (1977), pp. 356-357). The
peptide fragments are analyzed to identify sequences of amino acids that
may be encoded by oligonucleotides having the lowest degree of
degeneracy. This is preferably accomplished by identifying sequences that
contain amino acids which are encoded by only a single codon.
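
The search for a low-degeneracy stretch can be sketched in Python. The
codon counts below are those of the standard genetic code, in which
methionine (M) and tryptophan (W) are the only amino acids encoded by a
single codon. The function names are hypothetical, and the peptide
"MKWFSG" is a made-up example, not a serglycin sequence.

```python
# Number of codons per amino acid in the standard genetic code
# (one-letter designations; the 61 sense codons in total).
CODON_COUNT = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def degeneracy(peptide):
    """Number of distinct oligonucleotides that could encode peptide."""
    total = 1
    for residue in peptide.upper():
        total *= CODON_COUNT[residue]
    return total


def least_degenerate_window(peptide, length):
    """Return (degeneracy, start_index, window) for the window of the
    given length that can be encoded by the fewest oligonucleotides.
    Ties are resolved in favor of the earliest window."""
    if length < 1 or length > len(peptide):
        raise ValueError("window length must be between 1 and len(peptide)")
    windows = [
        (degeneracy(peptide[i:i + length]), i, peptide[i:i + length].upper())
        for i in range(len(peptide) - length + 1)
    ]
    return min(windows)


best = least_degenerate_window("MKWFSG", 4)
print(best)
```

A window rich in single-codon residues keeps the probe set small, while
windows containing serine, leucine or arginine (six codons each) expand
it quickly.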

Although occasionally an amino acid sequence may be encoded by
only a single oligonucleotide sequence, frequently the amino acid
sequence may be encoded by any of a set of similar oligonucleotides.
Importantly, whereas all of the members of this set contain
oligonucleotide sequences that are capable of encoding the same peptide
fragment and, thus, potentially contain the same oligonucleotide sequence
as the gene which encodes the peptide fragment, only one member of the
set contains the nucleotide sequence that is identical to the exon coding
sequence of the gene. Because this member is present within the set, and
is capable of hybridizing to DNA even in the presence of the other
members of the set, it is possible to employ the unfractionated set of
oligonucleotides in the same manner in which one would employ a single
oligonucleotide to clone the gene that encodes the peptide.

Using the genetic code (Watson, In: Molecular Biology of the
Gene, 3rd Ed., W.A. Benjamin, Inc., Menlo Park, CA (1977)), one or
more different polynucleotides or oligonucleotides can be identified from
the amino acid sequence, each of which would be capable of encoding the
serglycin protein or fragments thereof. The probability that a particular
polynucleotide will, in fact, constitute the actual serglycin encoding
sequence can be estimated by considering abnormal base pairing
relationships and the frequency with which a particular codon is actually
used (to encode a particular amino acid) in eukaryotic cells. Such "codon
usage rules" are disclosed by Lathe, R., et al., J. Molec. Biol. 183:1-12
(1985). Using the "codon usage rules" of Lathe, a single polynucleotide
sequence, or a set of polynucleotide sequences, that contain a theoretical
"most probable" nucleotide sequence capable of encoding the serglycin
sequences is identified.

The suitable polynucleotide, or set of polynucleotides, that is
capable of encoding the serglycin gene, or fragment thereof, may be
synthesized by means well known in the art (see, for example, Synthesis and
Application of DNA and RNA, S.A. Narang, ed., 1987, Academic Press,
San Diego, CA) and employed as a probe to identify and isolate the
cloned serglycin gene by techniques known in the art. Techniques of
nucleic acid hybridization and clone identification are disclosed by
Maniatis, et al. (In: Molecular Cloning, A Laboratory Manual, Cold
Spring Harbor Laboratories, Cold Spring Harbor, NY (1982)), and by
Hames, B.D., et al. (In: Nucleic Acid Hybridization, A Practical Approach,
IRL Press, Washington, DC (1985)), which references are herein
incorporated by reference. Those members of the above-described gene library
that are found to be capable of such hybridization are then analyzed to
determine the extent and nature of the serglycin encoding sequences that
they contain.

To facilitate the detection of the desired serglycin DNA encoding
sequence, the above-described DNA probe may be labeled with a
detectable group. Such detectable group can be any material having a
detectable physical or chemical property. Such materials have been
well-developed in the field of nucleic acid hybridization and in general most
any label useful in such methods can be applied to the present invention.
Particularly useful are radioactive labels, such as 32P, 3H, 14C, 35S, 125I,
or the like. Any radioactive label may be employed which provides for an
adequate signal and has a sufficient half-life. The oligonucleotide may be
radioactively labeled, for example, by "nick-translation" by well-known
means, as described in, for example, Rigby, P.J.W., et al., J. Mol. Biol.
113:231 (1977) and by T4 DNA polymerase replacement synthesis as
described in, for example, Deen, K.C., et al., Anal. Biochem. 135:456
(1983).

Alternatively, polynucleotides are also useful as nucleic acid
hybridization probes when labeled with a non-radioactive marker such as
biotin, an enzyme or a fluorescent group. See, for example, Leary, J.J.,
et al., Proc. Natl. Acad. Sci. USA 80:4045 (1983); Renz, M., et al., Nucl.
Acids Res. 12:3435 (1984); and Renz, M., EMBO J. 6:817 (1983).

Thus, in summary, the actual identification of the amino acid
sequence of serglycin permits the identification of a theoretical "most
probable" DNA sequence, or a set of such sequences, capable of encoding
such a peptide. By constructing an oligonucleotide complementary to this
theoretical sequence (or by constructing a set of oligonucleotides
complementary to the set of "most probable" oligonucleotides), one
obtains a DNA molecule (or set of DNA molecules) capable of
functioning as a probe(s) for the identification and isolation of clones containing
the serglycin gene, and thus the regulatory elements of the invention.

In an alternative way of cloning the serglycin gene, a library is
prepared using an expression vector, by cloning DNA prepared from a cell
possessing, and preferably capable of expressing, serglycin, into an
expression vector. The cDNA library is then screened for members that
express serglycin, for example, by screening the library with antibodies to
the protein, such as the antibody depicted in Figure 6.

In another embodiment, a previously described rat L2 cell-derived
cDNA of a related proteoglycan, pPG-1 (disclosed in Bourdon et al.,
Proc. Natl. Acad. Sci. USA 82:1372 (1985)), is used to identify a sequence
encoding serglycin. For example, Southern blots of digested genomic
DNA may be probed with nick-translated pPG-1 or pPG-M (a
gene-specific 489 bp Ssp I → 3' end fragment of pPG-1; Tantravahi et al.,
Proc. Natl. Acad. Sci. USA 83:9207 (1986)), under reduced stringency if
necessary, to allow for mismatch between the sequences expressed in the
different species.

The above discussed methods are, therefore, capable of identifying
genetic sequences that are capable of encoding the serglycin or fragments
of this protein. Such coding sequences may then be used to identify
clones containing the transcriptional regulatory elements of the invention.

For example, using the techniques described above, a number of
DNA fragments were identified when a human genomic DNA blot was
probed under conditions of low stringency with a rat serglycin cDNA
(Stevens et al., J. Biol. Chem. 263:7287 (1988)). Nevertheless, by probing
a human promyelomonocytic HL-60 cell-derived cDNA library under
conditions of low stringency with the rat cDNA, a cDNA was isolated and
characterized that encodes human serglycin (Stevens et al., J. Biol. Chem.
263:7287 (1988)). Sequence analysis of a resulting cDNA clone indicated
that in the human this proteoglycan peptide core is only 17.6 kDa and
contains an 18 amino acid glycosaminoglycan attachment region consisting
primarily of alternating serine and glycine. A single gene that resides on
chromosome 10 encodes this human protein (Stevens et al., J. Biol. Chem.
263:7287 (1988); Nicodemus et al., J. Biol. Chem. 265:5889 (1990); Mattei
et al., Human Genetics 82:87 (1989)). A human genomic library was
probed under conditions of high stringency with a 5' fragment of the
HL-60 cell cDNA to isolate two 18-kb genomic fragments that taken together
contain the entire human serglycin gene (Nicodemus et al., J. Biol. Chem.
265:5889 (1990)). A restriction map of this human gene was constructed,
the genomic fragments were subcloned into Bluescript™ plasmid, and the
nucleotide sequence of the entire 16.6 kb human gene, plus 0.7 kb of
5' flanking DNA, was determined using techniques known in the art.

In addition, a 1.0-kb cDNA that encodes mouse serglycin was
isolated from a cDNA library derived from mouse bone marrow-derived
mast cells. When the predicted amino acid sequences of the mouse, rat, and
human serglycin were compared, the N-terminus (not the serine-glycine
rich glycosaminoglycan-attachment region) was found to be the most
conserved region. This surprising finding suggests that the N-terminus of the
translated peptide core is important for the structure, function, and/or
metabolism of this family of proteoglycans. Areas of identity in the 3' and
5' untranslated regions in the human, rat, and mouse proteoglycan cDNAs
were also observed (Avraham et al., Proc. Natl. Acad. Sci. USA 86:3763
(1989)). Interestingly, these 3' and 5' conserved untranslated nucleotide
sequences were almost identical in the corresponding regions of the
cDNAs that encode human mast cell tryptase (Miller et al., J. Clin. Invest.
84:1138 (1989)), dog mast cell tryptase (Vanderslice et al., Biochemistry
29:4148 (1989)), mouse mast cell protease-2 (Serafin et al., J. Biol. Chem.
265:423 (1990)), and rat mast cell protease-II (Benfey et al., J. Biol. Chem.
262:5377 (1987)), suggesting that these nucleotide sequences may be
important for coordinated regulation of those genes that encode proteins
destined to reside in the secretory granules of hematopoietic cells.

To isolate the mouse serglycin gene, a mouse genomic DNA library
was probed under conditions of high stringency with a 3' gene-specific
fragment of the mouse bone marrow-derived mast cell-derived cDNA.
An ~18 kb genomic clone (λMG-PG1) which contains the entire gene
that encodes mouse serglycin was isolated. The exon/intron organization
of the mouse gene was determined, as well as the transcription-initiation
site and the 504-bp nucleotide sequence that is upstream of the gene.

Typically, transcription is initiated ~30 bp downstream of an
element within the proximal end of a gene's promoter (defined in this
case as the smallest amount of nucleotide sequence that must be present
to get minimal transcription of a gene in a cell). In most eukaryotic
genes, the promoters contain either a TATA box (Breathnach et al.,
Annu. Rev. Biochem. 50:349 (1981)) or a GC-rich element (Sehgal et al.,
Mol. Cell. Biol. 8:3160 (1988)). In rarer cases, such as the terminal
deoxynucleotidyltransferase gene (Smale, S.T., and Baltimore, D.,
Cell 57:103 (1989)), a third type of promoter region is present which lacks
these specific transcription-initiation control sequences.

The mouse serglycin gene does not contain either a classical TATA
box or a GC-rich element ~30 bp upstream of the transcription-initiation
site. Therefore, its promoter appears to belong to the rarer, third class of
promoters.

Accordingly, the above discussed methods are also capable of
directly identifying clones containing genetic sequences containing the
transcriptional regulatory elements of the invention.

The above discussed methods are also capable of being adapted for
the identification of clones directed to the trans-acting factors of the
invention. Especially, clones capable of expressing such trans-acting
factors may be identified utilizing the target sequence to which they bind
(in a double-stranded DNA form) to detect their presence in
protein-DNA binding assays. Such assays are well known in the art.

In order to further characterize such genetic sequences, and in
order to produce recombinant protein under the transcriptional control
of such sequences, such transcriptional regulatory elements must be
provided to an appropriate host.

III. Expression of Proteins Operably-linked to the Transcriptional
Regulatory Elements of the Invention

As used herein, "heterologous protein" is intended to refer to a
peptide sequence that is heterologous to the transcriptional regulatory
elements of the invention. A skilled artisan will recognize that, if desired,
the teaching herein will also apply to the expression of genetic sequences
encoding serglycin homologous to such regulatory elements.

To express a heterologous protein under the control of the
transcriptional regulatory elements of the invention, the heterologous
protein must be "operably-linked" to the regulatory element. An operable
linkage is a linkage in which a desired sequence is connected to a
transcriptional or translational regulatory sequence (or sequences) in such
a way as to place expression (or operation) of the desired sequence under
the influence or control of the regulatory sequence.

Two DNA sequences (such as a sequence encoding a heterologous
protein and a promoter region sequence linked to the 5' end of the
encoding sequence) are said to be operably linked if induction of
promoter function results in the transcription of the DNA encoding the
heterologous protein and if the nature of the linkage between the two
DNA sequences does not (1) result in the introduction of a frame-shift
mutation, (2) interfere with the ability of the expression regulatory
sequences to direct the expression of the heterologous protein DNA, or
(3) interfere with the ability of the heterologous protein template to be
transcribed by the promoter region sequence. Thus, a promoter region
would be operably linked to a DNA sequence if the promoter were
capable of effecting transcription of that DNA sequence.

In a similar manner, a transcriptional regulatory element that
stimulated or repressed promoter function may be operably-linked to such
promoter. Exact placement of the element in the nucleotide chain is not
critical as long as the element is located at a position from which the
desired effects on the operably linked promoter may be revealed. A
nucleic acid molecule, such as DNA, is said to be "capable of expressing"
a polypeptide if it contains expression control sequences which contain
transcriptional regulatory information and such sequences are operably
linked to the nucleotide sequence which encodes the polypeptide.

For the complete control of heterologous gene expression, all
transcriptional and translational regulatory elements (or signals) that are
operably linked to a heterologous gene should be recognizable by the
appropriate host. By "recognizable" in a host is meant that such signals
are functional in such host.

The cloned transcriptional regulatory elements, obtained through
the methods described above, and preferably in a double-stranded form,
may be operably linked to a heterologous gene, preferably in an
expression vector, and introduced into a host cell, preferably a eukaryotic
cell, and most preferably a eukaryotic cell of hematopoietic cell
origin, to produce recombinant heterologous protein.

Expression of the heterologous protein in different hosts may result
in different post-translational modifications that may or may not alter the
properties of the heterologous protein. Especially preferred hosts are cells,
either in vivo or in tissue culture, that provide post-translational
modifications to the heterologous protein that include folding and/or
glycosylation at sites similar or identical to that found for the native
proteoglycan.

Appropriate cells of hematopoietic cell origin include, for example,
hematopoietic cells that participate in immune and inflammatory
responses, including connective tissue mast cells, mucosal mast cells,
basophils, natural killer cells, cytotoxic T lymphocytes, eosinophils,
neutrophils, macrophages, and platelets. For example, rat basophilic
leukemia-1 cells (ATCC CRL-1378), mouse bone marrow-derived mast
cells, mouse mast cells immortalized with Kirsten sarcoma virus, normal
mouse mast cells that have been co-cultured with mouse fibroblasts, or
mouse myelomonocytic WEHI-3 cells (ATCC TIB-68) are useful. Razin
et al., J. Immunol. 132:1479 (1984); Levi-Schaffer et al., Proc. Natl. Acad.
Sci. (USA) 83:6485 (1986); and Reynolds et al., "Immortalization of Murine
Connective Tissue-type Mast Cells at Multiple Stages of Their
Differentiation by Coculture of Splenocytes with Fibroblasts that Produce
Kirsten Sarcoma Virus," J. Biol. Chem. 263:12783-12791 (1988). See
Example 5, below. Methods for the long term in vitro proliferation of
pluripotent bone marrow stem cells are known (Handbook of the
Hematopoietic Microenvironment, M. Tavassoli, ed., Humana Press, Inc.,
Clifton, New Jersey, 1989).

The precise nature of the regulatory regions needed for gene
expression may vary between species or cell types, but shall in general
include, as necessary, 5' non-transcribing and 5' non-translating
(non-coding) sequences involved with initiation of transcription and
translation, respectively. Especially, at a minimum, such 5'
non-transcribing control sequences will include a region which contains a
promoter for transcriptional control of the operably linked gene.

The promoter preferably is the serglycin gene promoter of the
invention. However, the enhancer and suppressor transcriptional
regulatory elements of the invention may be operably linked to any
promoter that is functional in the desired host cell. A wide variety of
transcriptional and translational regulatory sequences can be employed,
operably linked to a transcriptional regulatory element of the invention,
depending upon the nature of the eukaryotic host. In eukaryotes,
where transcription is not linked to translation, such control regions may
or may not provide an initiator methionine (AUG) codon, depending on
whether the operably linked heterologous sequence contains such a
methionine. Such regions will, in general, include a promoter region
sufficient to direct the initiation of RNA synthesis in the host cell.
Promoters from heterologous mammalian genes that encode an mRNA
product capable of translation are preferred, and especially, strong
promoters such as the promoter for actin, collagen, myosin, etc., can be
employed provided they also function as promoters in the host cell, and
provided that their function is also capable of being controlled by the
desired positive or suppressor element of the invention.

As is widely known, translation of eukaryotic mRNA is initiated at
the codon that encodes the first methionine. For this reason, it is
preferable to ensure that the linkage between a eukaryotic promoter and
a DNA sequence that encodes the heterologous protein does not contain
any intervening codons that are capable of encoding a methionine. The
presence of such codons results either in the formation of a fusion protein
(if the AUG codon is in the same reading frame as the DNA encoding
the heterologous protein) or in a frame-shift mutation (if the AUG codon
is not in the same reading frame as the DNA encoding the heterologous
protein).
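
By way of illustration only, the reading-frame consideration described
above can be sketched in a short Python routine. The routine is not part
of the disclosed methods: the function name and the example linker
sequences are hypothetical, and the linker is assumed to be the
sense-strand DNA lying immediately upstream of the first codon of the
sequence encoding the heterologous protein. An ATG whose distance to that
first codon is a multiple of three is reported as in-frame; any other ATG
is reported as out-of-frame. Intervening stop codons are not checked in
this sketch.

```python
def intervening_start_codons(linker):
    """List every ATG in a linker and its frame relative to the coding sequence.

    `linker` is assumed to be sense-strand DNA whose last base sits directly
    before the first codon of the heterologous coding sequence.
    Returns a list of (0-based position, "in-frame" | "out-of-frame").
    """
    linker = linker.upper()
    hits = []
    for i in range(len(linker) - 2):
        if linker[i:i + 3] == "ATG":
            # Number of bases from this ATG to the start of the coding sequence.
            distance = len(linker) - i
            frame = "in-frame" if distance % 3 == 0 else "out-of-frame"
            hits.append((i, frame))
    return hits


# Hypothetical linkers, for illustration only.
for example in ("GGATGCCC", "GGATGCC", "GGCCTT"):
    print(example, intervening_start_codons(example))
```

A linker for which the routine returns an empty list contains no
intervening methionine codon on the sense strand and is, in this limited
respect, the preferred arrangement.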

If desired, a fusion product of the heterologous protein may be
constructed. For example, the sequence coding for the heterologous
protein may be linked to a signal sequence which will allow secretion of
the protein from, or the compartmentalization of the protein in, a
particular host. Such signal sequences may be designed with or without
specific protease sites such that the signal peptide sequence is amenable
to subsequent removal. Alternatively, the native signal sequence for this
protein may be used.

The transcriptional initiation regulatory elements of the invention
can be selected to allow for repression or activation, so that expression of
the operably linked genes can be modulated. Translational signals are not
necessary when it is desired to express antisense RNA sequences.

If desired, the non-transcribed and/or non-translated regions 3' to
the sequence coding for the heterologous protein can be obtained by the
above-described cloning methods. The 3'-non-transcribed region may be
retained for its transcriptional termination regulatory sequence elements;
the 3'-non-translated region may be retained for its translational
termination regulatory sequence elements, or for those elements that
direct polyadenylation in eukaryotic cells. Where the native expression
control sequence signals do not function satisfactorily in the host cell,
then sequences functional in the host cell may be substituted.

To transform a mammalian cell with the DNA constructs of the
invention many vector systems are available, depending upon whether it
is desired to insert the heterologous protein DNA construct into the host
cell chromosomal DNA, or to allow it to exist in an extrachromosomal
form.

If the heterologous protein's DNA sequence and an operably
linked promoter are introduced into a recipient eukaryotic cell as a
non-replicating DNA (or RNA) molecule, which may either be a linear
molecule or, more preferably, a closed covalent circular molecule that is
incapable of autonomous replication, the expression of the heterologous
protein may occur through the transient expression of the introduced
sequence.

Genetically stable transformants may be constructed with vector
systems, or transformation systems, whereby the heterologous protein's
DNA is integrated into the host chromosome. Such integration may occur
de novo within the cell or, in a most preferred embodiment, be assisted by
transformation with a vector that functionally inserts itself into the host
chromosome. Vectors capable of chromosomal insertion include, for
example, retroviral vectors, transposons or other DNA elements which
promote integration of DNA sequences in chromosomes, especially DNA
sequences homologous to a desired chromosomal insertion site.

Cells that have stably integrated the introduced DNA into their
chromosomes are selected by also introducing one or more markers that
allow for selection of host cells that contain the desired sequence. For
example, the marker may provide biocide resistance, e.g., resistance to
antibiotics, or heavy metals, such as copper, or the like. The selectable
marker gene can either be directly linked to the DNA gene sequences to
be expressed, or introduced into the same cell by co-transfection.

In another embodiment, the introduced sequence is incorporated
into a plasmid or viral vector capable of autonomous replication in the
recipient host. Any of a wide variety of vectors may be employed for this
purpose, as outlined below.

Factors of importance in selecting a particular plasmid or viral
vector include: the ease with which recipient cells that contain the vector
may be recognized and selected from those recipient cells which do not
contain the vector; the number of copies of the vector which are desired
in a particular host; and whether it is desirable to be able to "shuttle" the
vector between host cells of different species.

Preferred eukaryotic plasmids include those derived from the
bovine papilloma virus, vaccinia virus, and SV40. Such plasmids are well
known in the art and are commonly or commercially available. For
example, mammalian expression vector systems in which it is possible to
cotransfect with a helper virus to amplify plasmid copy number, and to
integrate the plasmid into the chromosomes of host cells, have been
described (Perkins, A.S. et al., Mol. Cell. Biol. 3:1123 (1983); Clontech,
Palo Alto, California).

Once the vector or DNA sequence containing the construct(s) is
prepared for expression, the DNA construct(s) is introduced into an
appropriate host cell by any of a variety of suitable means, including
transfection, electroporation or delivery by liposomes. DEAE-dextran, or
calcium phosphate, may be useful in the transfection protocol.

After the introduction of the vector in vitro, recipient cells are
grown in a selective medium, that is, medium that selects for the growth
of vector-containing cells. Expression of the cloned gene sequence(s)
results in the production of the heterologous protein.

According to the invention, this expression can take place in a
continuous manner in the transformed cells, or in a controlled manner.

If desired, in in vitro culture, the expressed protein is isolated and
purified in accordance with conventional conditions, such as extraction,
precipitation, chromatography, affinity chromatography, electrophoresis,
or the like.

The vectors obtained through the methods above will provide
sequences that, by definition, provide a transcriptional regulatory element
of the invention (the serglycin promoter, and/or the enhancer element
and/or the suppressor element). Such vectors may be designed with
restriction enzyme sites that allow for the insertion of a DNA
sequence encoding a heterologous protein at a site or sites operably
linked to the transcriptional regulatory complex (the promoter and any
additional elements that alter promoter function).

Using the techniques described above, cotransfection of rat
fibroblasts with λMG-PG1 and the selectable marker pSV2 neo resulted
in the establishment of fibroblast cell lines that had integrated both
foreign genes into their genome (Avraham et al., J. Biol. Chem. 264:16719
(1989)). RNA blot analysis revealed that two of the rat fibroblast cell
lines contained low, but detectable, levels of the 1.0-kb mRNA transcript
that encodes mouse serglycin. No other gene that encodes a proteoglycan
peptide core has been isolated and sequenced in its entirety. Neither has
one been inserted into a foreign cell.

The ability of the transfected rat fibroblasts to transcribe the
foreign mouse gene indicates that some of the regulatory elements in the
gene's promoter are present in the isolated mouse genomic clone. When
the 504-bp 5' flanking region of the mouse serglycin gene was compared
to the corresponding 5' flanking region of the analogous human gene, a
119-bp region that immediately precedes the transcription-initiation site
was found to be nearly identical. This nucleotide sequence is more highly
conserved in evolution than any similar sized region of the gene that is
translated into protein.

The 504-bp 5' flanking region of the mouse serglycin gene was
linked to plasmid DNA that contains the structural sequences of the
human growth hormone reporter gene, and the amount of growth
hormone produced by different cell types transfected with the resulting
plasmid construct was quantified. With deletion analysis and site-directed
mutagenesis, three motifs in the 5' flanking region of the mouse serglycin
gene were identified that regulate its constitutive transcription. One of
these elements suppressed transcription of the gene, whereas the other
two elements enhanced its transcription. Due to the near identity of this
5' flanking region in the two species, it is likely the same cis-acting
elements are used by all mouse and human cells that express this
proteoglycan. As indicated by gel-mobility-shift assays, hematopoietic cells
that transcribe the serglycin gene possess trans-acting factors in their
nuclei that recognize these elements, and a different profile of trans-acting
factors is present in fibroblasts that do not express the serglycin gene.

When the element that resides between nucleotide residues -118
and -81 was used as an enhancer, more growth hormone was produced in
transiently-transfected cells than with control plasmid DNA containing a
generic promoter. Because this cis-acting motif is one of the most potent
enhancers now known for hematopoietic cells, it can be used as an
effective tool to drive transcription (and thereby translation) of any
foreign gene in hematopoietic cells.

IV. Characterization of the Trans-Acting Factors

Transcription is regulated by trans-acting factors that bind to
distinct cis-acting elements usually located in the 5' flanking regions of
genes, and these DNA-binding proteins can act in synergy to enhance
transcription or in an opposing manner to suppress transcription. As
assessed by gel mobility shift assays, rat basophilic leukemia-1 cells and
rat-1 fibroblasts contain a number of DNA-binding proteins in their nuclei
that specifically bind the region of DNA that contains the serglycin
suppressor cis-acting element, the serglycin enhancer cis-acting element,
and the proximal element of the serglycin promoter region. Based on their
similar mobilities in the gel mobility shift assays, rat basophilic leukemia-1
cells and rat-1 fibroblasts contain a common trans-acting factor
(B/F(-250/-161)-I) that binds to the suppressor element and a common
trans-acting factor (B/F(-40/+24)-I) that binds to the proximal promoter. In
addition, distinct trans-acting factors are present in each cell line. Rat-1
fibroblasts have distinct trans-acting factors that bind to the suppressor
element (F(-250/-161)-II), the enhancer element (F(-118/-81)-I) and the
proximal promoter (F(-40/+24)-II), whereas rat basophilic leukemia-1 cells
have distinct trans-acting factors that bind to the enhancer element
(B(-118/-81)-I) and the proximal promoter (B(-40/+24)-II).

Stably transfected rat-1 fibroblasts that have incorporated 10-20
copies of the mouse genomic clone λMG-PG1 into their genome
constitutively express low levels of the 1.0 kb serglycin transcript. Based
on the transient transfections described herein, one reason normal
fibroblasts contain no serglycin mRNA, and transfected fibroblasts contain
only limited amounts, is the presence of the trans-acting factor of
the invention in these mesenchymal cells, such trans-acting factor being
very effective in suppressing transcription of this gene.

A computer search using the "Dynamics" program (Ghosh, D.,
Nucleic Acids Res. 18:1749-1756 (1990)) failed to reveal a conserved
cis-acting element within residues -250 to -118 of the mouse and human
serglycin gene that is recognized by a known suppressor DNA-binding
protein, supporting the novelty of the cis-acting element present in this
region of the serglycin gene. Because fibroblasts are more effective than
rat basophilic leukemia-1 cells in their use of the element that resides
between residues -250 and -190 to suppress transcription of the mouse
serglycin gene, the responsible trans-acting factor may be more abundant,
selectively expressed, or post-translationally modified to be more active.
As assessed by the gel mobility shift assays with the residues -250 to -161
probe, the nuclear extracts of rat-1 fibroblasts contained at least one
trans-acting factor that was not recognized in rat basophilic leukemia-1
cells.

V. Uses of the Invention 

Although bacteria, yeast and insect cells often can be transfected
with foreign cDNAs or genes to obtain biologically-active recombinant
proteins, in many situations it is necessary to express a protein in a
mammalian cell so that it can be properly modified post-translationally.
Some of the easiest cells to maintain in culture and to transfect with
foreign DNA are immature hematopoietic cells such as rat basophilic
leukemia-1 cells, mouse WEHI-3 monocytic cells, and mouse P815
mastocytoma cells. All three of these cell lines have their own spectrum
of trans-acting DNA-binding proteins that bind to the cis-acting elements
of those genes that the cells are programmed to express. Thus, in order
to obtain maximal expression of a foreign gene in a transfected cell, one
must use the regulatory elements of a gene that is expressed in abundance
in that cell type. Serglycin mRNA is expressed in abundance in a large
number of immature mouse, rat, and human hematopoietic cells that are
easy to maintain in culture. The cis-acting elements of the invention, in
the 5' flanking region of the mouse serglycin gene, can, for the first time,
be used to drive transcription and translation of a foreign gene
in transfected rat basophilic leukemia cells and WEHI-3 monocytic cells.

Thus, the invention is useful for the expression of any protein in
a mammalian, and especially a hematopoietic, cell system, especially any
protein that requires the mammalian environment for post-translational
modifications, including glycosylation. Proteins of interest that may be so
expressed include hormones, such as insulin and growth hormone, other
peptide growth factors, cytokines, interferons, interleukins, enzymes,
structural proteins, albumin, actin, etc., and especially c-kit ligand,
granulocyte-macrophage colony stimulating factor, interferon-γ, IL-1, IL-3,
IL-4, IL-9, IL-10, nerve growth factor, and transforming growth factor-β.

Many varieties of transcriptional control may be provided to a 
heterologous gene by the regulatory elements of the invention. In a host 
cell of hematopoietic cell origin, for example, if a genetic sequence 
encoding the 5' flanking region of the serglycin gene (for example, the 
proximal 504 bp of the 5' flanking region) is operably linked to a 
heterologous gene, such genetic sequence may be expected to express in
a manner, and to a degree, similar to that of the native serglycin peptide
gene.

Expression of a desired heterologous protein in a host cell of
hematopoietic cell origin may be achieved by operably linking genetic
sequence encoding the enhancer element of the invention to a desired
promoter sequence functional in such host cell, such element being
located between nucleotides -118 and -81 of the serglycin gene, and such
element being dominantly active to stimulate transcription of operably
linked genes in hematopoietic cells.

Expression of a desired heterologous protein in a host cell of
hematopoietic cell origin may also be achieved by introducing genetic
sequences encoding the unique and atypical eukaryotic promoter element
operably linked to the coding sequence of the desired heterologous
protein, such promoter element being located between nucleotides -40 and
-20 of the serglycin gene, and such element being dominantly active for
the promotion of transcription of operably linked genes in hematopoietic
cells.

Expression in fibroblast hosts may be modified such that a desired
gene that overexpresses an undesired protein in a fibroblast host may be
"turned off" by introducing genetic sequences encoding the suppressor
element of the invention, located between nucleotides -250 and -190 of
the serglycin gene, on an integrating or viral vector that inserts such
element into the transcriptional regulatory region of the gene.
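
As a non-limiting sketch, the nucleotide coordinates recited above for
the suppressor, enhancer and promoter elements can be mapped onto a
stored 504-nucleotide 5' flanking sequence as follows. The sketch assumes
that residue -1 is the last nucleotide of the flanking sequence, that
there is no residue 0, and that the recited boundaries are inclusive;
these conventions, the function names, and the placeholder sequence are
assumptions made for illustration and are not taken from the disclosure.

```python
FLANK_LENGTH = 504  # length of the 5' flanking region described in the text

# Element boundaries as recited in the text, relative to the
# transcription-initiation site; treated here as inclusive (an assumption).
ELEMENTS = {
    "suppressor": (-250, -190),
    "enhancer": (-118, -81),
    "promoter": (-40, -20),
}


def element_slice(start, end, flank_length=FLANK_LENGTH):
    """Convert inclusive upstream positions (-N .. -1) to a Python slice."""
    if not (-flank_length <= start <= end <= -1):
        raise ValueError("positions must lie within the flanking region")
    return slice(flank_length + start, flank_length + end + 1)


def extract_element(flank, name):
    """Return the part of `flank` that corresponds to a named element."""
    if len(flank) != FLANK_LENGTH:
        raise ValueError("expected a 504-nucleotide flanking sequence")
    start, end = ELEMENTS[name]
    return flank[element_slice(start, end)]


# Placeholder sequence only; it is NOT the serglycin 5' flanking sequence.
placeholder_flank = "ACGT" * 126
for name in ELEMENTS:
    print(name, len(extract_element(placeholder_flank, name)))
```

Under these conventions the suppressor, enhancer and promoter elements
are 61, 38 and 21 nucleotides long, respectively.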

For example, mast cell-derived glycosaminoglycans such as heparin
and chondroitin sulfate di-B have potent biologic activities in different
clinical situations. Unfortunately, prior to the invention, it has been
difficult to obtain these glycosaminoglycans in sufficient quantity for
analysis. Because of this problem, the biologic activities of mast
cell-derived chondroitin sulfate E and chondroitin sulfate D have not even
been tested. The invention provides the ability to culture mast cells and
to alter the constituents of the mast cell's secretory granule
using recombinant cytokines under the transcriptional control of the
regulatory elements of the invention. Each cytokine or other desired
factor may be examined for its ability to induce the polymerization of a
specific type of glycosaminoglycan onto serglycin. Thus, mast cells may be
induced to polymerize a specific type of glycosaminoglycan onto serglycin
in response to specific recombinant cytokines, and, for the first time, the
culture scaled up to obtain large amounts of the glycosaminoglycan of
interest. Alternatively, using recombinant technology an animal may be
genetically altered such that it can be induced to express large numbers
of that particular mast cell that contains the desired glycosaminoglycan.

Having now generally described the invention, the same will
become better understood by reference to certain specific examples which
are included herein for purposes of illustration only and are not intended
to be limiting unless otherwise specified.

EXAMPLES 

Example 1 

Construction and Screening of a cDNA Library

The promyelocytic leukemia cell line, HL-60, is a transformed
human cell that synthesizes chondroitin sulfate proteoglycans and stores
these proteoglycans in its secretory granules. Under certain in vitro
conditions, this cell can be induced to differentiate into cells that resemble
neutrophils, monocytes, macrophages, eosinophils, and basophils. HL-60
cells (line CCL 240; American Type Culture Collection, Rockville,
Maryland, USA) were lysed in the presence of guanidine isothiocyanate
(BRL, Gaithersburg, MD), and total RNA was purified by the CsCl
density-gradient centrifugation technique of Chirgwin et al., Biochemistry
18:5294 (1979). The poly(A)+ RNA that was obtained by oligo(dT)-cellulose
(Collaborative Research, Waltham, MA) chromatography (Aviv,
H., and Leder, P., Proc. Natl. Acad. Sci. USA 69:1408 (1972)) was
converted into cDNA (Okayama, H., and Berg, P., Mol. Cell. Biol.
2:161-170 (1982)). The resulting cDNAs were blunt ended with T4 DNA
polymerase (Biolabs, Beverly, MA), the internal EcoRI sites methylated,
and the cDNAs ligated to EcoRI poly-linkers. After selection of cDNAs
of >500 bp by Sepharose CL-4B (Pharmacia) chromatography, the
cDNAs were ligated to dephosphorylated λgt10 DNA. Escherichia coli
(strain C600 Hfl) were infected with the resulting recombinant
bacteriophages, resulting in a library with a complexity >1 x 10^. The
HL-60 cell-derived cDNA library was probed at 37°C with [α-32P]dCTP (3000
Ci/mmol; New England Nuclear, Boston, MA) nick-translated pPG-1 in
hybridization buffer (50% formamide, 5X SSC (1X SSC is 0.15 M NaCl/15 mM
sodium citrate), 2X Denhardt's buffer, 0.1% sodium dodecyl sulfate
(SDS), 1 mM EDTA, 100 μg/ml salmon sperm DNA carrier, and 10 mM
sodium phosphate). The filters were washed at 37°C under conditions of
low stringency of 1.0X SSC, 0.1% SDS, 1 mM EDTA, and 10 mM sodium
phosphate, pH 7.0. Approximately 500,000 recombinants in the library
were plated to isolate the clone designated cDNA-H4 (Figure 1). The
HL-60 cell-derived cDNA library (approximately 500,000 recombinants) was
rescreened using cDNA-H4 as the probe. Thirty clones that hybridized
under conditions of high stringency (55°C; 0.2X SSC, 1 mM EDTA, 0.1%
SDS, and 10 mM sodium phosphate, pH 7.0) with cDNA-H4 were
isolated from the secondary screening of the library.
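
For convenience, the salt content of the 5X, 1.0X and 0.2X SSC
solutions used in this Example can be worked out from the standard
definition of 1X SSC (0.15 M NaCl/15 mM sodium citrate). The following
Python sketch is offered only as an aid to calculation; the function and
constant names are illustrative and form no part of the procedure.

```python
# 1X SSC, the standard definition: 0.15 M NaCl / 15 mM sodium citrate.
NACL_MM_PER_X = 150.0
CITRATE_MM_PER_X = 15.0


def ssc_composition(strength):
    """Return (NaCl mM, sodium citrate mM) for a given SSC strength (5 means 5X)."""
    if strength < 0:
        raise ValueError("strength must be non-negative")
    return (
        round(strength * NACL_MM_PER_X, 3),
        round(strength * CITRATE_MM_PER_X, 3),
    )


for label, strength in [
    ("hybridization buffer", 5.0),
    ("low-stringency wash", 1.0),
    ("high-stringency wash", 0.2),
]:
    nacl, citrate = ssc_composition(strength)
    print(f"{label}: {strength}X SSC = {nacl} mM NaCl, {citrate} mM sodium citrate")
```

Thus 5X SSC corresponds to 0.75 M NaCl and 75 mM sodium citrate, and
0.2X SSC corresponds to 30 mM NaCl and 3 mM sodium citrate.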

The individual HL-60 cell-derived cDNAs and their subcloned
fragments were inserted into M13mp18 and M13mp19 (Amersham,
Arlington Heights, IL) and sequenced by the dideoxy chain termination
method of Sanger et al., Proc. Natl. Acad. Sci. USA 74:5463 (1977). Both
strands of cDNA-H4 were sequenced. The sequencing strategy is
presented in Figure 1. The consensus nucleotide sequence of the HL-60
derived secretory granule proteoglycan peptide core cDNAs is shown in
Figure 2.

A 249 bp EcoRI->EcoRI fragment has been isolated from an
EcoRI digest of cDNA-H19. The nucleotide sequence of this fragment
(Table I) contains the sequence expected for a polyadenylation site
(underlined) and the poly(A)+ tail (underlined). This fragment hybridizes
to a genomic fragment that encodes the gene for this proteoglycan
peptide core, and thus probably represents the next 249 bp of the
transcript.

Table I

Consensus nucleotide sequence of the 3' end of the cDNA that
encodes serglycin, the peptide core of the HL-60 cell secretory granule
proteoglycan (Avraham, S. et al., Proc. Natl. Acad. Sci. USA 86:3763-3767
(1989)).

GAATTCTTAA-AGGATTATGC-TTTAATGCTG-TTATCTATCT- 
TATTGTTGTT-GAAAATACCT-GCATTTTTTG-GTATCATGTT- 
CAACCAACAT-CATTATGAAA-TTAATTAGAT-TCCCATGGCX:- 
ATAAAATGGC-TTTAAAGAAT-ATATATATAT-TnTAAAGTA- 
GCTTGAGAAG-CAAATTGGCA-GGTAATATTT-CATACCTAAA-
TTAAGACnx:T-GACTTGGATT-GTGAATTATA-ATGATATGCC- 
CCl'l l lUl IA-TAAAAACAAA-AAAAAAATAA-T [SEQ ID No. 1] 

Example 2 

Chromosomal Localization of the Human Serglycin Gene

For the chromosome localization of the human gene that encodes
cDNA-H4, DNA from five different human/mouse (lines 13C2, 24B2,
1711, 462TG, and 175) and 12 different human/hamster (lines 35A2,
35A4, 35B5, 35C1, 35D3, 35D5, 35E4, 35F1, 35F3, 35F5, 89E5, and 95A4)
somatic cell hybrids were digested with BamHI. The resulting fragments
were resolved by agarose gel electrophoresis, and the DNA blots were
analyzed under conditions of high stringency using cDNA-H4 as a probe.
The percent discordance of the cDNA-H4 probe to each human
chromosome was determined as described in Table II; a discordant
fraction of 0.00 indicates that, in HL-60 cells, the serglycin gene is located
on chromosome 10.

Table II

Segregation Pattern of cDNA-H4 with DNA from
Human/Rodent Somatic Cell Hybrids

The DNA from different human/hamster and human/mouse
somatic hybrid cell lines and the DNA from the controls were analyzed
for their hybridization to cDNA-H4. The column designations are: +/+,
both hybridization to cDNA-H4 and the specific human
chromosome are present; -/-, hybridization to cDNA-H4 and the
chromosome are both absent; +/-, hybridization is present but the
chromosome is absent; and -/+, hybridization is absent but the
chromosome is present. For calculation of the discordant fraction for
each chromosome, the sum of the +/- and -/+ columns is divided by the
sum of the +/+, -/-, +/-, and -/+ columns. The 19q+ category represents
the der 19 translocation chromosomes for the hybrid clones derived from
fusions with leukocytes from the two different X/19 translocation carriers.
The X and Xq- categories represent the intact X and the der X
translocation chromosomes. Bruns et al., Biochem. Genet. 17:1031-1059
(1979).
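
The calculation of the discordant fraction may be illustrated by the
following Python sketch. It assumes that the denominator is the total
number of hybrid lines scored for a chromosome, that is, the sum of all
four columns; the function name and the counts used in the example are
hypothetical and are not taken from Table II.

```python
def discordant_fraction(both_present, both_absent, hyb_only, chrom_only):
    """Discordant fraction for one chromosome.

    both_present: +/+ lines (hybridization and chromosome present)
    both_absent:  -/- lines (hybridization and chromosome absent)
    hyb_only:     +/- lines (hybridization present, chromosome absent)
    chrom_only:   -/+ lines (hybridization absent, chromosome present)
    """
    total = both_present + both_absent + hyb_only + chrom_only
    if total == 0:
        raise ValueError("no hybrid lines were scored")
    return (hyb_only + chrom_only) / total


# Hypothetical counts for 17 hybrid lines, for illustration only.
print(discordant_fraction(6, 11, 0, 0))   # fully concordant chromosome
print(discordant_fraction(3, 10, 2, 2))   # partly discordant chromosome
```

A chromosome whose presence or absence matches the hybridization signal
in every hybrid line scored has a discordant fraction of zero.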

Example 3

Identification of Nucleotide Sequences
in the Human Genome that Encode Serglycin

The rat L2 cell-derived cDNA, pPG-1, disclosed in Bourdon et al.,
Proc. Natl. Acad. Sci. USA 82:1322 (1985), was used to identify the
genomic fragments encoding human serglycin from a BamHI digest of
human genomic DNA. While no hybridization occurred when the DNA
blot was probed under conditions of high stringency with either pPG-1 or
pPG-M, or probed under conditions of low stringency with pPG-M, at
least 10 DNA fragments were visualized when the blot was probed under
conditions of low stringency with pPG-1. The large number of DNA
fragments detected suggested that there was a multi-gene family in the
human which contained repetitive sequences similar to those which
encode the serine-glycine repeat region of the L2 cell proteoglycan
peptide.

Example 4 

Isolation and Characterization of the Human Serglycin Gene

Subcloned fragments of the HL-60 cell-derived proteoglycan
cDNA, cDNA-H4, were radiolabeled with [α-32P]dCTP (3000 Ci/mmol;
DuPont-New England Nuclear) to a specific activity of >10^ cpm/μg by
either nick translation (Maniatis, T. et al., Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor,
New York, pp. 109-114 (1985)) or random priming (Feinberg, A.P. et al.,
Anal. Biochem. 166:224-229 (1983)), and then were used to screen ≈10^
recombinants in an EMBL3 human genomic library (Klickstein, L.B. et al.,
J. Exp. Med. 165:1095-1112 (1987)) by plaque hybridization.
Nitrocellulose filters (Millipore, Bedford, MA) were probed at 42°C in
50% formamide, 0.75 M NaCl, 75 mM sodium citrate, 5X Denhardt's
buffer, 0.1% SDS, 1 mM EDTA, 100 μg/ml salmon sperm DNA carrier,
and 10 mM sodium phosphate. The nitrocellulose filters were washed at
55°C with 30 mM NaCl, 3 mM sodium citrate, 0.1% SDS, 1 mM EDTA,
and 10 mM sodium phosphate, pH 7.0. Several independent clones were
obtained using the entire 650 bp HL-60 cell-derived cDNA, cDNA-H4
(Figure 2). However, in order to obtain better representation of the 5'
flanking region of the gene, the human genomic library was rescreened
using the 136 bp 5'->/iQ3nI fragment of cDNA-H4 to isolate 2 additional
clones. The restriction maps of the clones were determined by incubating
samples of their DNA separately with AccI, BamHI, EcoRI, HindIII, KpnI,
or SalI (New England Biolabs, Beverly, MA). The digests were
electrophoresed in 1% agarose gels, and the separated DNA fragments
were transferred to Nytran membranes (Schleicher and Schuell, Keene,
NH) (Southern, E.M., J. Mol. Biol. 98:503-517 (1975)). The resulting
DNA blots were probed with specific 5' (5'->KpnI and 5'->XmnI) and
3' (XmnI->3' and AccI->3') fragments of cDNA-H4.

Nucleotide Sequence Analysis of the Human Serglycin Gene.
Human genomic fragments (Figure 3) were subcloned into the Bluescript
(Stratagene, La Jolla, CA) plasmid vector using double enzyme polarized
shotgun ligations to improve the efficiency of recombination and to
maintain the orientation of the subclones (Kurtz, D.T. et al., Gene
13:145-152 (1980)). Recombinant transformants were identified by colony
hybridization and were restriction mapped by the same method as that
used above for the phage clones. Double stranded DNA sequencing
(Sanger, F. et al., Proc. Natl. Acad. Sci. USA 74:5463-5467 (1977); Zhang,
H. et al., Nucl. Acids Res. 16:1220-7 (1988)) was performed directly on the
plasmid subclones using a Sequenase nucleotide sequencing kit (United
States Biochemical, Cleveland, OH) and [α-35S]dATP (1000 Ci/mmol;
Amersham). Universal oligonucleotide primers (SK, KS, T3, T7 and
M13rev; Stratagene) were used to determine the sequence of the first and
last 300 nucleotides in each subcloned fragment. Based on the
nucleotide sequences of the sense strand and the antisense strand of the
genomic fragment, two oligonucleotides that were each 18 nucleotides in
length were synthesized on an Applied Biosystems 380A Oligonucleotide
Synthesizer at the Harvard Microchemistry Facility, Cambridge, MA.
These oligonucleotides were then used as primers to determine the
contiguous nucleotide sequence of the next 200-250 nucleotides in each
direction of the double stranded DNA. Additional oligonucleotides
complementary to different regions of the cDNA were used as primers to
extend the sequence from the exons in both directions. Nucleotide
sequence data was entered and edited on an IBM-PC using the Clatech
molecular biology software package. Database searches and homology
comparisons with the mouse serglycin gene and other genes were
performed using the computers at the Molecular Biology Computer
Research Resource at Dana-Farber Cancer Institute, Boston, MA.

When a human genomic library was screened using the entire 650
bp cDNA-H4 probe, six independent clones were isolated (designated as
λHG-PG1 to λHG-PG6). Restriction mapping of these clones revealed
that all six of the clones lacked a 5' 12 kb EcoRI->EcoRI fragment and
failed to hybridize to a 136 bp 5'">A/?/iI fragment of cDNA-H4.
Rescreening of the genomic library with the 5' fragment of cDNA-H4
resulted in the isolation of two additional clones that were designated as
λHG-PG7 and λHG-PG8, respectively. When analyzed by restriction
mapping, clones λHG-PG6, λHG-PG7, and λHG-PG8 contained
gene which encodes human serglycin.

Detailed restriction mapping of these subclones revealed that this
gene spans at least 16.6 kb and consists of 3 exons (Figures 3 and 4).
Between the first and second exon is an 8.4 kb intron, and between the
second and third exon is a 6.7 kb intron. Both introns begin with the
nucleotide sequence "GTAAG" and end with the sequence "CAG".
Analysis of the nucleotide sequence of this gene revealed that exon 1
encodes the 5' untranslated region of the mRNA transcript and the entire
27 amino acid hydrophobic signal peptide of the translated molecule.
Exon 2 encodes a 49 amino acid portion of the peptide core (amino acid
residues 28 to 76) which would be predicted to be the N-terminus of the
molecule after the hydrophobic signal peptide is removed in the
endoplasmic reticulum. Exon 3 (634 bp) is the largest exon and encodes
the remaining 82 amino acids of the translated molecule and the entire 3'
untranslated region of the mRNA transcript. These 82 amino acids
comprise a 17 amino acid sequence (residues 77 to 93) that immediately
precedes the glycosaminoglycan attachment region, the 18 amino acid
serine-glycine rich region (residues 94 to 111), and the C-terminus of the
translated molecule (residues 112 to 158).
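
The residue numbering above can be checked with a short tally. The following Python sketch is illustrative only: the constant and function names are assumptions introduced here, and only the residue ranges themselves are taken from the text.

```python
# Residue ranges (1-based, inclusive) exactly as stated in the text.
SIGNAL_PEPTIDE_LEN = 27            # signal peptide, encoded by exon 1
EXON2_RESIDUES = (28, 76)          # stated as a 49 amino acid portion
EXON3_REGIONS = {
    "pre_attachment": (77, 93),    # stated as 17 amino acids
    "ser_gly_rich": (94, 111),     # stated as 18 amino acids
    "c_terminus": (112, 158),      # remainder of the translated molecule
}


def span_length(span):
    """Number of residues in an inclusive (start, end) range."""
    start, end = span
    return end - start + 1


def exon3_length():
    """Total residues encoded by exon 3 (sum of its three regions)."""
    return sum(span_length(s) for s in EXON3_REGIONS.values())


def total_length():
    """Length of the full translated molecule, signal peptide included."""
    return SIGNAL_PEPTIDE_LEN + span_length(EXON2_RESIDUES) + exon3_length()


if __name__ == "__main__":
    print("exon 2 residues:", span_length(EXON2_RESIDUES))
    print("exon 3 residues:", exon3_length())
    print("total residues:", total_length())
```

Under these ranges the exon 3 regions sum to the stated 82 residues, and the three exons together account for all 158 residues of the translated molecule.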

Determination of the Transcription-Initiation Site of the Human
Serglycin Gene. An S1 nuclease mapping analysis was performed to
identify the transcription-initiation site of the human gene that encodes
the peptide core of serglycin proteoglycans in HL-60 cells. A 4 kb SalI
-> HindIII fragment of the genomic clone XHG-PG7 was subcloned into
Bluescript (designated pB5SH3). An oligonucleotide (5'-XTFTGAACTG
AGGATrCCAGAA-3' [SEQ ID No. 2]) was synthesized that
corresponded to residues 89 to 110 of the antisense strand of cDNA-H4.
Ten nanograms of this oligonucleotide were hybridized to 4 μg of
alkali-denatured pB5SH3, and a complementary strand of DNA was
synthesized under conditions similar to those described above except that
it was labeled with [α-32P]dATP. A 400 bp antisense DNA probe was
isolated following electrophoresis of the synthetic product on a denaturing
8 M urea/polyacrylamide gel. The single-stranded radiolabeled DNA
fragment was identified by autoradiography, electroeluted, and ethanol
precipitated. A 50,000 cpm sample of the radiolabeled DNA fragment
was hybridized to ≈15 μg of HL-60 cell-derived total RNA (Chirgwin, J.M.
et al., Biochemistry 18:5294-5299 (1979)) or 1 μg of HL-60 cell-derived
poly(A)+ RNA (Aviv, H. et al., Proc. Natl. Acad. Sci. USA 69:1408-1412
(1972)) at 48°C for 16 h in 80% formamide, 400 mM NaCl, 1 mM EDTA,
and 40 mM Pipes (pH 6.4). The 32P-DNA/RNA hybrid was incubated
with 100 U of S1 nuclease (Pharmacia) for 60 min. At the end of the
reaction, the sample was extracted with phenol, and ethanol precipitated
at -80°C. Three microliters of 1 mM EDTA and 10 mM Tris-HCl (pH
8.0) and 4 μl of formamide loading buffer were added to the precipitated
sample. The sample was boiled, and loaded onto an 8 M urea/8%
polyacrylamide sequencing gel alongside a digest of 32P-labeled pBR322
(New England Biolabs) and a sequencing ladder of pBSH3 that had been
primed with the same oligonucleotide. For two negative controls, S1
nuclease reactions were concurrently performed with 15 μg of tRNA
(Bethesda Research Labs) or BMMC (Razin, E. et al., J. Biol. Chem.
257:7229-7236 (1982)) total RNA.

HL-60 cell-derived total RNA and poly(A)+ RNA protected 132
nucleotides of the probe from degradation by S1 nuclease. Therefore, it
was concluded that the putative transcription-initiation site in HL-60 cells
for this gene resided 53 bp upstream of the translation-initiation site. The
32P-labeled 5' antisense 400 bp DNA fragment was not protected if it was
incubated with tRNA or mouse mast cell RNA prior to exposure to S1
nuclease. This deduced transcription-initiation site in HL-60 cells
corresponds to the deduced transcription-initiation site of the analogous
gene that is expressed in BMMC-derived mast cells and rat basophilic
leukemia cells (Bourdon, M.A. et al., Mol. Cell. Biol. 7:33-40 (1987)), but
not in rat L2 yolk sac tumor cells (Bourdon, M.A. et al., Mol. Cell. Biol.
7:33-40 (1987)).

Figure 3 is a restriction map of the human gene. Figure 4 is the
nucleotide sequence of the gene that encodes human serglycin including
5' flanking and intron sequences.

Cloning of the Mouse Serglycin Gene. A 15 kb mouse genomic
fragment containing the gene that encodes the mouse serglycin was cloned
by screening a mouse genomic library derived from a Sau3AI digest of
BALB/c mouse liver DNA (Avraham, S., et al., Proc. Natl. Acad. Sci. USA
86:3763-3767 (1989)), using a [α-32P]dCTP labeled 450 bp AccI->3'
gene-specific fragment of a bone marrow-derived mast cell cDNA
(cDNA-M6) that encodes the peptide core of mouse secretory granule
proteoglycan using methods as described above. The nucleotide sequence
and the deduced amino acid sequence of this gene are presented in Figure
5.

Neither the human nor the mouse gene has a classical TATA box
(Breathnach, R. et al., Ann. Rev. Biochem. 50:349-383 (1981)) or GC-rich
element (Sehgal, A. et al., Mol. Cell. Biol. 8:3160-3167 (1988)) ≈30 bp
upstream of its transcription-initiation site. Therefore, the serglycin gene
that is expressed in hematopoietic cells has an unusual promoter. The 5'
flanking region has not been described for any other human proteoglycan
peptide core gene, and thus comparisons with genes that encode other
proteoglycan peptide cores cannot yet be made. Of importance is the
finding that 96% of the nucleotides that are present in a 119 bp
nucleotide sequence just upstream of the transcription-initiation site of the
human (residues -1 to -119) and mouse (residues -1 to -123) gene are
identical. This degree of conservation greatly exceeds that obtained when
any other 119 bp region within the exons of the gene in these two species
is compared, and suggests that important cis-acting regulatory elements
are present in this conserved nucleotide sequence.
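
A conservation figure of this kind is a percent identity over aligned positions. The Python sketch below shows one way such a value could be computed; it is not the method used in the original analysis. It assumes the two sequences have already been aligned to equal length (with "-" marking any gap, which is needed here because the human and mouse regions are 119 and 123 bp long), and it counts gap positions as mismatches. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned columns in which both sequences carry the same base.

    Both inputs must be pre-aligned strings of equal length. A column that
    contains a gap character ("-") in either sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    # Toy, made-up sequences; not taken from Figures 4 or 5.
    print(percent_identity("ACGTACGT", "ACGTACGA"))
```

For scale, 96% identity over a 119 bp window corresponds to roughly 114 matching positions.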

Example 5 

Stable Transfection of Rat-1 Fibroblasts
with the Mouse Serglycin Gene

Fisher rat-1 fibroblasts were grown in Dulbecco's modified essential
medium (DMEM) supplemented with 10% fetal calf serum, 2 mM
glutamine, 100 U/ml of penicillin, and 100 μg/ml of streptomycin, at 37°C
in a humidified atmosphere of 5% CO2. DNA cotransfections were
performed essentially as described elsewhere (Southern, P.J. et al., J. Mol.
Appl. Gen. 1:327-341 (1982)). In brief, 3-4 x 10^ rat-1 fibroblasts were
placed into each 10-cm plastic culture dish containing DMEM for 12-24
h before cotransfection with the mouse genomic clone XMG-PG1 and the
selectable marker pSV2 neo. A calcium phosphate/DNA precipitate was
created by adding 0^ ml of a 250-mM solution of calcium phosphate
containing 5 μg of the XMG-PG1 DNA and 0.5 μg of pSV2 neo drop-wise
in the presence of bubbling air to 0.5 ml of 280 mM NaCl, 10 mM KCl,
12 mM dextrose, 1.5 mM sodium phosphate, and 50 mM HEPES (pH
7.1). The precipitate that formed after a 30 min incubation was added to
a culture dish of fibroblasts, and 10 to 18 h later, the DNA precipitate was
removed. The transfected cells were washed twice with growth medium
and then were allowed to recover for 24 h before being trypsinized and
split at a ratio of 1:6. The resulting fibroblasts were plated into new 10-cm
plastic dishes and cultured for 2 to 3 wk in DMEM containing 500
μg/ml gentamicin (Gibco); the culture medium was changed every 3 days.
At the end of this period, gentamicin-resistant colonies of transfected
fibroblasts were individually picked with cloning cylinders and grown as
cell lines in culture medium containing 100 μg/ml gentamicin.

RNA and DNA Blot Analysis of Rat-1 Fibroblasts Stably
Transfected with the Mouse Serglycin Gene. Total RNA was prepared
from mouse bone marrow-derived mast cells, rat-1 fibroblasts, and
transfected rat-1 fibroblasts by a guanidinium thiocyanate method
(Chirgwin, J.M. et al., Biochemistry 18:5294-5299 (1979); Glisin, V. et al.,
Biochemistry 13:2633-2637 (1974)). RNA (5 μg/lane) was electrophoresed
in 1% formaldehyde-agarose gels, and transferred to Zetabind (Thomas,
P.S., Proc. Natl. Acad. Sci. USA 77:5201-5205 (1980)). The resulting RNA
blots were incubated at 42°C for 24 h in hybridization buffer containing
a radiolabeled AccI->3' fragment of cDNA-M6. The blots were washed
under conditions of high stringency, and autoradiography was performed.
The mouse serglycin probe was removed from the blots by high
temperature washing and the blots were reprobed with an actin cDNA to
quantitate the amount of mRNA that had been loaded in each lane.

DNA was isolated (Blin, N. et al., Nucleic Acids Res. 3:2303-2308
(1976)) from the mouse liver, rat liver, rat-1 fibroblasts, and transfected
rat-1 fibroblasts, and samples were digested (10 μg/digest) separately with
XmnI, BamHI, BglII, SspI, Sau3AI, HindIII, or EcoRI for 4 h at 37°C.
The fragments were resolved by agarose gel electrophoresis and were
transferred to Zetabind. The resulting DNA blots were analyzed for
hybridization under conditions of high stringency with the AccI->3'
fragment of the mouse cDNA-M6 as a probe.

Expression of the Mouse Gene that Encodes the Peptide Core of
Mouse Serglycin Gene in Transfected Rat-1 Fibroblasts. To demonstrate
that XMG-PG1 contained the entire mouse serglycin gene, including its
promoter region, and that this mouse genomic clone could be expressed
in another mammalian cell, rat-1 fibroblasts were cotransfected with
XMG-PG1 and the dominant neo-resistant selectable marker encoded by
the plasmid pSV2 neo. Seventeen independent clones of neo-resistant
transfected rat-1 fibroblasts were isolated and were expanded separately.
Total RNA was isolated from bone marrow-derived mast cells,
neo-transfected rat-1 fibroblasts, and the cotransfected rat-1 fibroblast cell
lines.

The gene-specific serglycin probe failed to hybridize to any
transcript in RNA blots of non-transfected fibroblasts; however, it did
hybridize to a 1.0-kb RNA transcript in mouse bone marrow-derived mast
cells and in two of the cotransfected rat-1 fibroblast cell lines. Primer
extension analyses were performed using RNA from the transfected
fibroblasts to determine the transcription-initiation site. When RNA from
the transfected cells was used as an RNA template, ≈80 nucleotides were
extended onto the oligonucleotide primer that corresponded to residues
78 to 98 of cDNA-M6, resulting in a DNA product of about 100
nucleotides in length. A DNA product of ≈60 nucleotides was obtained
when the alternative primer that corresponded to residues 39 to 59 of
cDNA-M6 was used in the assay.
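
The two primer-extension results can be cross-checked arithmetically. The Python sketch below assumes that the reverse transcriptase extends each antisense primer to the 5' end of the mRNA, so that the product length equals the highest cDNA residue covered by the primer plus the number of mRNA nucleotides lying upstream of cDNA residue 1. The function names are hypothetical, and the product sizes in the text are approximate gel estimates.

```python
def primer_length(first_residue, last_residue):
    """Length of a primer covering cDNA residues first..last (inclusive)."""
    return last_residue - first_residue + 1


def implied_upstream_offset(product_length, last_residue):
    """Nucleotides of mRNA upstream of cDNA residue 1 implied by a product.

    Assumes full extension to the mRNA 5' end, so that
    product_length = last_residue + offset.
    """
    return product_length - last_residue


def extended_nucleotides(product_length, first_residue, last_residue):
    """Nucleotides added onto the primer during extension."""
    return product_length - primer_length(first_residue, last_residue)


if __name__ == "__main__":
    # (first residue, last residue, approximate product length) from the text.
    for first, last, product in [(78, 98, 100), (39, 59, 60)]:
        print(
            f"primer {first}-{last}: extended by "
            f"{extended_nucleotides(product, first, last)} nt, "
            f"implied offset {implied_upstream_offset(product, last)} nt"
        )
```

Both primers imply that the transcript begins within a few nucleotides of the 5' end of cDNA-M6, which is the sense in which the two measurements agree.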

Genomic DNA was prepared from the above two clones of
transfected rat-1 fibroblasts, and was digested with BglII, XmnI, SalI, or
BamHI. DNA blots of the digests were probed with the AccI->3'
gene-specific fragment of cDNA-M6 to demonstrate that these transfected
rat-1 fibroblast cell lines contained mouse serglycin genomic sequences. The
mouse proteoglycan probe hybridized to a 2.7-kb fragment present in the
BglII digest of mouse liver DNA, and to a 7.5-kb fragment in the BglII
digests of both rat liver DNA and rat-1 fibroblast DNA. The transfected
fibroblasts differed from the non-transfected rat-1 fibroblasts in that they
contained both the 2.7-kb and the 7.5-kb DNA fragments. Based on the
relative intensity of hybridization of the gene-specific probe to the 2.7-kb
fragment present in the BglII digests of equal amounts of mouse liver
DNA and fibroblast DNA, the fibroblast cell lines may have incorporated
10-20 and 2-3 copies, respectively, of the mouse serglycin gene into their
genome.



Transfections have been performed in Chinese hamster ovary cells
with a cDNA that encodes the peptide core of the fibroblast-derived
dermatan sulfate proteoglycan called decorin (Yamaguchi, Y. et al., Nature
336:244-246 (1988)) and in COS-1 cells with a cDNA that encodes the
peptide core of the T-cell derived invariant chain proteoglycan that
associates with Ia (Miller, J. et al., Proc. Natl. Acad. Sci. USA 85:1359-1363
(1988)), but no transfections have been reported using a genomic clone
that contains an entire proteoglycan peptide core gene.

Example 6 

Preparation of Antibodies to Peptides of the Amino Acid Consensus
Sequence Which Recognize Native HL-60 Cell Derived Glycin

A 16 amino acid peptide 02 [Ser-Asn-Lys-Ile-Pro-Arg-Leu-Arg-Thr-
Asp-Leu-Phe-Pro-Lys-Thr-Arg] (SEQ ID No. 3) was chemically
synthesized, coupled to hemocyanin, and injected into a New Zealand
White rabbit. This peptide corresponds to residues 64-79 of the translated
molecule and is a region of the core that precedes the serine-glycine
rich glycosaminoglycan attachment region.
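
The peptide length can be checked against the stated residue range. The Python sketch below converts the three-letter sequence of peptide 02 to one-letter code and compares its length with the span of residues 64-79. The lookup table covers only the residues that occur in this peptide, and the names are introduced here for illustration.

```python
# Three-letter to one-letter codes for the residues present in peptide 02.
THREE_TO_ONE = {
    "Ser": "S", "Asn": "N", "Lys": "K", "Ile": "I", "Pro": "P",
    "Arg": "R", "Leu": "L", "Thr": "T", "Asp": "D", "Phe": "F",
}

# Peptide 02 as given in the text (SEQ ID No. 3).
PEPTIDE_02 = "Ser-Asn-Lys-Ile-Pro-Arg-Leu-Arg-Thr-Asp-Leu-Phe-Pro-Lys-Thr-Arg"


def to_one_letter(three_letter_seq):
    """Convert a hyphen-separated three-letter sequence to one-letter code."""
    return "".join(THREE_TO_ONE[res] for res in three_letter_seq.split("-"))


def residue_span(first, last):
    """Number of residues in the inclusive range first..last."""
    return last - first + 1


if __name__ == "__main__":
    one_letter = to_one_letter(PEPTIDE_02)
    print(one_letter, len(one_letter), residue_span(64, 79))
```

The 16 residues of the peptide match the 16 positions spanned by residues 64 to 79.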

The induction of antibodies which specifically recognize the peptide
core protein of human serglycin was tested as follows. The peptide (3
mg) was coupled with 5 mg of Keyhole Limpet hemocyanin (Sigma) in the
presence of 0.25% glutaraldehyde, and polyclonal antibodies were raised
to the coupled peptide in New Zealand White rabbits using standard
immunization methodologies. Antibody titers in whole sera were
measured using an enzyme linked immunosorbent assay (ELISA). Each
microtiter well was incubated overnight at 4°C with 1 μg of synthetic
peptide in phosphate buffered saline. After the remaining protein binding
sites in the wells were blocked by a 1 h incubation with 1% (w/v) bovine
serum albumin (Sigma), the wells were washed with phosphate buffered
saline containing 1% (w/v) Tween 20. Rabbit serum that was serially
diluted in phosphate buffered saline was added, followed by horseradish
peroxidase-conjugated goat anti-rabbit IgG (Bio-Rad, Richmond, CA); the
wells were then assayed spectrophotometrically for development of the
2,2'-azino-di-[3-ethyl-benzthiazoline sulfonate] dye (Boehringer-Mannheim,
Indianapolis, IN). Peptide 01 (Ser-Val-Gln-Gly-Tyr-Pro-Thr-Gln-Arg-Ala-
Arg-Tyr-Gln-Trp-Val-Arg) [SEQ ID No. 4] that corresponded to residues
24 to 39 of the deduced amino acid sequence of cDNA-H4 was also
synthesized and used in the ELISA to confirm the specificity of the rabbit
antisera. Anti-peptide IgG was partially purified by ammonium sulfate
precipitation followed by ion exchange chromatography.

Anti-peptide IgG (~30 μg) was incubated with 100 μl of a 15%
(w/v) suspension of the Protein A-Sepharose beads (Sigma) in RIPA
buffer for 1 h at room temperature. The resulting Protein A-Sepharose-
IgG complex was added to 1 ml of RIPA cell lysates containing 5 x 10^
cell equivalents of [35S]methionine-labeled or [35S]sulfate-labeled HL-60
cells that had been precleared by incubation for 24 h with Protein A-
Sepharose alone and then for 24 h with Protein A-Sepharose/preimmune
IgG. After an 18-24 h incubation at 4°C with Protein A-Sepharose/anti-
peptide IgG, the beads were washed 3 times by centrifugation with 0.1%
bovine serum albumin, 0.5% Tween 20, and 10 mM phosphate buffered
saline (pH 7.2) containing either 10 mM unlabeled methionine or
unlabeled sodium sulfate. The bound radiolabeled antigens were eluted
by suspending the beads in 60 μl of Laemmli buffer and incubating for 5
min at 95°C. The eluates were electrophoresed in 15% SDS-PAGE gels,
stained with Coomassie Brilliant blue, dried, and autoradiographed using
Kodak XAR-5 film.

In the ELISA, the antiserum gave half-maximal binding at an
approximate 500 fold dilution (Figure 6). The anti-peptide 02 serum
failed to recognize peptide 01 which corresponded to deduced amino acid
residues 24 to 39 of the same cDNA (Figure 6). The preimmune sera
also failed to react with the coupled peptide 02. When 1 μl of the
antisera was preincubated with 1 μg of peptide 02 for 60 minutes at 25°C,
no immunoreactivity was detected in the ELISA.

An IgG-enriched fraction of the anti-peptide 02 sera was used to
determine if Protein A-Sepharose-bound antibodies would recognize the
initially-translated serglycin and the mature proteoglycan. A prominent
20,000 Mr protein was specifically immunoprecipitated from lysates of 2
min [35S]methionine-labeled HL-60 cells, whereas both a 20,000 Mr
protein and a macromolecule that barely entered the gel were specifically
immunoprecipitated from 10 min radiolabeled cells. After a 10 min pulse
and a 5 min chase, the 20,000 Mr [35S]methionine-labeled protein was less
apparent while the macromolecule was somewhat increased. The
[35S]methionine-labeled macromolecule corresponded in size exactly to
the [35S]sulfate-labeled proteoglycan that was precipitated after an
overnight radiolabeling of the cells with [35S]sulfate. Because
precipitation was inhibitable by preincubation of the Protein A-Sepharose-
immune IgG with 1 μg of the synthetic peptide 01, it was concluded that
the rabbit anti-peptide 02 antibodies recognize the precursor and mature
human glycin.

As shown in Figure 7, the size of the immunoprecipitated peptide
core protein was approximately 13,000 daltons, consistent with the size
predicted by Stevens et al., J. Biol. Chem. 263:7287 (1988) for the peptide
core which has lost its 27 amino acid signal peptide.

Example 7 
Isolation of Serglycin Proteoglycans

Human serglycin proteoglycans can be isolated using common
protein isolation techniques known in the art such as column
chromatography, gel electrophoresis, affinity chromatography, or immuno-
extraction techniques using the antibody described above. For example,
such proteoglycans may be extracted by the following procedure (Stevens,
R.L., et al., J. Biol. Chem. 260:14194-14200 (1985)).

Bone marrow-derived mast cell pellets are lysed by resuspension
for 30 s in 50 μl of 1% Zwittergent 3-12 containing protease inhibitors,
followed by the addition of 235 ml of 4 M guanidine HCl (GnHCl)
containing CsCl (density 1.4 g/ml). These detergent-GnHCl proteoglycan
extracts are then pooled such that in a typical experiment 48 ml of extract
is obtained from approximately 2 x 10^ bone marrow-derived mast cells,
of which 3 x 10^ are radiolabeled. The pooled extracts are centrifuged
at 17°C for 48 h at 95,000 x g, and the gradients are divided in most
experiments into two equal fractions termed D1 (bottom) and D2 (top),
respectively. The distribution of chondroitin sulfate E proteoglycan in
fractions from the CsCl gradient or from subsequent ion exchange or gel
filtration chromatography is determined by suspending a sample of each
fraction in 12.5 ml of Hydrofluor and quantitating 35S or 3H radioactivity
in the radiolabeled proteoglycan on a Tracer Analytic Mark III liquid
scintillation counter. Protein is detected by the method of Lowry et al.,
with bovine serum albumin as a standard or by optical density at 280 nm.
Nucleic acids are detected at a wavelength of 260 nm. The bottom
fraction of each CsCl gradient is placed in dialysis tubing of 50,000 Mr
cut-off and dialyzed at 4°C against 1 M NaCl for 24 h and then for an
additional 24 h against 1 M urea containing 0.05 M Tris-HCl, pH 7.8.
The dialysate is adjusted to 4 M in urea by the addition of solid urea and
applied to a 0.8 x 29-cm column of DEAE-52 previously equilibrated in
4 M urea, 0.05 M Tris-HCl, pH 7.8. The ion exchange column is washed
with 35 ml of 4 M urea, 0.05 M Tris-HCl, pH 7.8, and the chondroitin
sulfate E proteoglycan eluted with a 180-ml linear gradient of NaCl (0-1.0
M) in the urea buffer at a flow rate of 4 ml/h. Two-ml fractions are
collected, and the proteoglycan-enriched fractions, detected by monitoring
a portion of the fraction for either 35S or 3H radioactivity if the cells have
been prelabelled, are pooled, dialyzed 48 h at 4°C against 0.1 M
NH4HCO3, and lyophilized. This material is redissolved in 100 μl of 4 M
GnHCl/0.1 M sodium sulfate/0.1 M Tris-HCl, pH 7.0, applied to a 0.6 x
100-cm column of Sepharose CL-4B in this same buffer, and eluted from
the column at a flow rate of IS ml/h. One-half ml fractions are collected
and analyzed for radioactivity and absorbance at 280 nm. The
proteoglycan-containing fractions are pooled, dialyzed against 0.1 M NH4HCO3
and lyophilized.



Example 8

Isolation and Protease-Resistance
of HL-60 Cell Serglycin Proteoglycan



Radiolabeling of HL-60 Cells. HL-60 cells (line CCL 240;
American Type Culture Collection, Bethesda, MD) were cultured in
enriched medium [RPMI-1640 medium supplemented with 10% (v/v) fetal
calf serum, 2 mM L-glutamine, 0.1 mM nonessential amino acids, 100
U/ml of penicillin, and 100 μg/ml of streptomycin (Gibco, Grand Island,
NY)] at 37°C in a humidified atmosphere of 5% CO2. For [35S]
methionine-labeling, HL-60 cells were preincubated at a concentration of
10^ cells/ml for 10 min in methionine-free, enriched medium containing
dialyzed fetal calf serum. Approximately 500 μCi/ml of [35S]methionine
(129 Ci/mmol; Amersham, Arlington Heights, IL) was then added. The
HL-60 cells were incubated for an additional 2 to 10 min at 37°C,
centrifuged in the cold at 120 x g, and washed at 4°C in enriched medium.
In the pulse-chase experiments, HL-60 cells were [35S]methionine-labeled
for 10 min, were washed as above, and were resuspended in normal
enriched medium at 37°C for an additional 5 min. Aliquots of 5 x 10^
[35S]methionine-labeled HL-60 cells were lysed in 1 ml of RIPA buffer
[0.15 M NaCl, 1% deoxycholate, 1% Nonidet P-40, 0.1% SDS, 10 mM
N-ethylmaleimide, 2 mM phenylmethylsulfonyl fluoride, 10 mM NaF, and
0.1 M Tris-HCl, pH 7Ji], and immunoprecipitates of the lysates were
analyzed by SDS-polyacrylamide gel electrophoresis (SDS-PAGE) as
described below.

For [35S]sulfate labeling, HL-60 cells were incubated in enriched
medium containing 50 μCi/ml of [35S]sulfate (~4000 Ci/mmol; DuPont-
New England Nuclear, Boston, MA) for 1 h at a density of 1 x 10^
cells/ml or for 18 h at a density of 2 x 10^ cells/ml. The radiolabeled cells
were centrifuged at 4°C for 10 min at 120 x g, and 150 μl of 1% (w/v)
Zwittergent 3-12 (Calbiochem, San Diego, CA) containing 100 μg of
chondroitin sulfate A (Miles Scientific, Naperville, IL) and 100 μg of
heparin (Sigma, St. Louis, MO) glycosaminoglycan carriers were added to
each cell pellet followed by 1.35 ml of 4 M GnHCl, 0.1 M sodium sulfate,
and 0.1 M Tris-HCl. A sample of each lysate and supernatant was
chromatographed on Sephadex G-25/PD-10 columns (Pharmacia,
Piscataway, NJ) to quantitate the incorporation of [35S]sulfate into
macromolecules.

In order to isolate the [35S]sulfate-labeled HL-60 cell serglycin
proteoglycans, solid CsCl was added to the remainder of the cell lysates
to achieve final densities of 1.4 g/ml. Following centrifugation for 48 h at
~100,000 x g, the bottom 33% of each CsCl gradient was dialyzed
sequentially against 0.5 M sodium acetate for 24 h and 0.1 M ammonium
bicarbonate for an additional 24 h. The dialysates were lyophilized and
redissolved in 0.4 ml of water. Samples of partially purified [35S]sulfate-
labeled proteoglycans were incubated for 30 min with or without 10 μg of
Pronase (Calbiochem), and the digests were applied sequentially to a 0.8
x 85 cm column of Sepharose CL-6B (Pharmacia) that had been
equilibrated with 4 M GnHCl, 0.1 M sodium sulfate, 0.1 M Tris-HCl, pH
7.2. As a control, samples of [35S]sulfate-labeled chondrosarcoma
proteoglycans were analyzed in parallel for their susceptibility to Pronase
by incubation of 1 μg proteoglycan in 50 μl Hanks' balanced salt solution
at 37°C for 30 min with 5 μg Pronase. Pronase-sensitive chondrosarcoma
proteoglycans are extracellular matrix proteins and are distinguished from
secretory granule proteoglycans.

No substantial change in the hydrodynamic size of the HL-60 cell
[35S]sulfate-labeled serglycin proteoglycans was detected following Pronase
treatment, whereas rat [35S]sulfate-labeled chondrosarcoma proteoglycans
were susceptible to degradation. These results show that the proteoglycan
was resistant to Pronase digestion.

Example 9 

Identification of Transcriptional Regulatory Elements of the Serglycin
Gene and the Trans-acting Proteins That Bind To These Elements



A. Experimental Procedures 

1. Cell Lines — Rat basophilic leukemia-1 (RBL-1) cells (line
CRL-1378, American Type Culture Collection (ATCC), Rockville, MD)
and mouse myelomonocytic WEHI-3 cells (line TIB-68; ATCC) are cell
lines of hematopoietic origin that abundantly express the serglycin
transcript, whereas Fisher rat-1 fibroblasts (obtained from R.A. Weinberg,
Whitehead Institute, Massachusetts Institute of Technology, Cambridge,
MA) and mouse NIH/3T3 fibroblasts (line CRL 1658; ATCC) do not
(Tantravahi et al., Proc. Natl. Acad. Sci. USA 83:9207-9210 (1986)). Rat
basophilic leukemia-1 cells, WEHI-3 cells, and the two fibroblast cell lines
were grown in enriched medium (Dulbecco's modified Eagle's Medium
supplemented with 10% fetal calf serum, 2 mM L-glutamine, 100 U/ml
penicillin, and 100 μg/ml streptomycin (GIBCO, Grand Island, NY)) at
37°C in a humidified atmosphere of 6% CO2. Cells were split 1:4 every
3 days.

2. Plasmid DNA Constructs — With a polymerase chain
reaction methodology (Saiki et al., Science 239:487-491 (1988)), various
lengths of the 504-bp 5' flanking region of the mouse serglycin gene
(Avraham et al., J. Biol. Chem. 264:16719-16726 (1989)) that all extend
upstream of residue +24 were obtained. DNA constructs were prepared
by ligating the various DNA fragments into the HindIII/XbaI
restriction-enzyme cloning sites of pφGH and pSV40-hGH. pφGH is a
pUC12 plasmid that contains a promoterless hGH gene (Selden et al.,
Mol. Cell. Biol. 6:3173-3179 (1986)), and pSV40-hGH is a plasmid which
contains the early SV40 promoter without its enhancer linked to the
structural sequences of the hGH gene (Chung et al., Proc. Natl. Acad. Sci.
USA 83:7918-7922 (1986); Sarid et al., J. Biol. Chem. 264:1022-1026
(1989)). A pUC12 plasmid that contains an enhancerless thymidine
kinase promoter ligated to the hGH gene (pTKGH), and a pUC12
plasmid that contains both the enhancer and promoter of the mouse
metallothionein-I gene ligated to the hGH gene (pXGH5) (Selden et al.,
Mol. Cell. Biol. 6:3173-3179 (1986)) were used as positive control plasmids
in the DNA transfections. As described for other cell types (Sarid et al.,
J. Biol. Chem. 264:1022-1026 (1989); Selden et al., Science 236:714-718
(1987)), the latter well-characterized metallothionein-I-hGH fusion gene
was used to optimize the DNA transfections and to normalize the
efficiency of expression of hGH by the different cells.

Relevant 21-mer oligonucleotides that span the mutation site were
used to perform site-directed mutagenesis (Zoller et al., DNA 3:479-488
(1984)) on three nucleotides in a plasmid (designated
pPG(-504/+24)hGH) containing the 504-bp 5' flanking region of the
serglycin gene. In these constructs, the adenosine at residue -28 was
converted to a cytosine, the cytosine at residue -30 was converted to an
adenosine, or the adenosine at -38 was converted to a guanosine. The
oligonucleotides used in the polymerase chain reactions, the site-directed
mutagenesis, and the gel mobility shift experiments described below were
synthesized on a Cyclone Plus DNA Synthesizer (Milligen/Biosearch,
Novato, CA). The relevant nucleotide sequences within the different
plasmid constructs were verified by dideoxy sequencing of the plasmid
DNA, as described by Sanger and coworkers (Sanger et al., Proc. Natl.
Acad. Sci. USA 74:5463-5467 (1977)) with modifications essential for
double-stranded DNA sequencing (Chen et al., DNA 4:165-170 (1985)).
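
The three single-nucleotide substitutions can be represented in silico as follows. This Python sketch is illustrative only: it assumes the common numbering convention in which the first transcribed nucleotide is +1 and there is no position 0, and it uses a made-up 40-nucleotide toy sequence (`TOY_FLANK`, spanning -40 to -1) rather than the actual 5' flanking sequence. All names are hypothetical.

```python
# (position, expected base, replacement base), one construct per entry,
# as listed in the text.
MUTATIONS = [(-28, "A", "C"), (-30, "C", "A"), (-38, "A", "G")]

# Made-up sequence spanning -40..-1, built so that the expected bases sit
# at -38 (A), -30 (C) and -28 (A). It is NOT the serglycin flanking region.
TOY_FLANK = "GG" + "A" + "G" * 7 + "C" + "G" + "A" + "G" * 27


def index_of(position, first_position):
    """Map a +1/-1 style position to a 0-based string index.

    first_position is the position of the first character of the sequence
    (for example -504). Position 0 does not exist in this convention.
    """
    if position == 0 or first_position == 0:
        raise ValueError("position 0 does not exist in this numbering")
    idx = position - first_position
    if first_position < 0 < position:
        idx -= 1  # step over the non-existent position 0
    return idx


def mutate(seq, first_position, position, expected, replacement):
    """Return a copy of seq with one base substituted, checking the original."""
    i = index_of(position, first_position)
    if not 0 <= i < len(seq):
        raise IndexError("position lies outside the sequence")
    if seq[i] != expected:
        raise ValueError(f"expected {expected} at {position}, found {seq[i]}")
    return seq[:i] + replacement + seq[i + 1:]


if __name__ == "__main__":
    for pos, old, new in MUTATIONS:
        mutant = mutate(TOY_FLANK, -40, pos, old, new)
        print(pos, old, "->", new, mutant)
```

Checking the expected base before substituting guards against an off-by-one error in the numbering, which is the usual failure mode when positions straddle the transcription-initiation site.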

3. Transfection Experiments — Rat basophilic leukemia-1
cells were transiently transfected with the plasmid DNA constructs with
DEAE-dextran (Lopata et al., Nucleic Acids Res. 12:5707-5717 (1984)).
The day before DNA transfection, the rat basophilic leukemia-1 cells (1
x 10^/dish) were plated in 100-mm culture dishes. Immediately before
transfection, cells were washed once with 3 ml of serum-free enriched
medium, and then 25 ml of serum-free enriched medium containing 5 μg
of supercoiled plasmid DNA complexed to 20 mg/ml DEAE-dextran (Mr
500,000 daltons) was added. The dishes were incubated for 4 h at 37°C
in a humidified atmosphere of 6% CO2, the transfection solution was
removed by aspiration, 3 ml of serum-free enriched medium containing
10% dimethyl sulfoxide was added, and the cultures were incubated for
2 min more at room temperature. The transfected cells were treated with
dimethyl sulfoxide, washed twice with 4 ml of serum-free enriched
medium, and then cultured in 10 ml enriched medium at 37°C in a
humidified atmosphere of 6% CO2. No matter which DNA construct was
used in the transfection, 100 h later each culture dish contained
approximately 5 x 10^ rat basophilic leukemia-1 cells.

Rat fibroblasts, mouse fibroblasts, and mouse WEHI-3 cells were
transiently transfected with the plasmid DNA constructs using calcium
phosphate (Avraham et al., J. Biol. Chem. 264:16719-16726 (1989);
Southern et al., J. Mol. Appl. Gen. 1:327-341 (1982)). The cells (1 x 10^)
were suspended in 2.5 ml of enriched medium, 0.2 ml of a 250 mM
solution of calcium phosphate containing 5 μg of supercoiled plasmid
DNA was added in a drop-wise manner, and the cultures were incubated
overnight at 37°C in a humidified atmosphere of 6% CO2. The
transfection media were removed the following day, and the cells were
washed and cultured at 37°C in 10 ml of fresh enriched medium in a
humidified atmosphere of 6% CO2. No matter which DNA construct was
used in the transfection, 100 h later each culture dish contained
approximately 4 x 10^ fibroblasts. Approximately 100 h after the
transfection by either method, 0.1 ml samples of culture medium were
removed, and the levels of hGH were determined with an immunoassay
kit from Nichols Institute Diagnostics (San Juan Capistrano, CA). The
amounts of hGH in the culture media were determined by assaying the
amounts of absorbed 125I-labeled anti-hGH antibody in the sandwich
assays.

The results of the transfection assays were normalized for
transfection efficiency using pXGH5. The amount of hGH produced by
pXGH5-transfected cells was arbitrarily assigned a value of 1, and the
relative amount of hGH produced by each cell type transfected with the
test plasmid was then calculated as a ratio to that obtained with this
control plasmid. In order to determine the promoter activity of various
DNA constructs in dissimilar cell types transfected by different methods,
a comparison of the relative amounts of hGH in the culture media for each
cell type is preferable to a comparison of the absolute amounts of hGH
(Sarid et al., J. Biol. Chem. 264:1022-1026 (1989)). It has been reported
in other studies (Selden et al., Mol. Cell. Biol. 6:3173-3179 (1986); Sarid
et al., J. Biol. Chem. 264:1022-1026 (1989); Selden et al., Science
236:714-718 (1987)) that the amount of growth hormone produced by cells
transfected with different hGH constructs is related to the amount of
hGH mRNA in the transfected cells. To confirm that the variation in the
amount of hGH in the culture media of cells transfected with the different
constructs reflected a change in the level of hGH mRNA in the cells, total
RNA was isolated from transfected rat basophilic leukemia-1 cells and
rat-1 fibroblasts. Blots containing total RNA (10 µg/sample) were then
prepared and probed with a 32P-labeled 950-bp BglII/EcoRI fragment of
the hGH cDNA present in pφGH.
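
The normalization described above can be sketched in a few lines. This is only an illustration of the arithmetic: the raw hGH readings and the dictionary layout are hypothetical, and the function name `relative_expression` is an assumption introduced here, not part of the original methods.

```python
def relative_expression(test_hgh, control_hgh):
    """Express hGH from a test plasmid as a ratio to the pXGH5 control.

    Both values must come from the same cell type and the same experiment,
    so that differences in transfection efficiency cancel out.
    """
    if control_hgh <= 0:
        raise ValueError("control hGH reading must be positive")
    return test_hgh / control_hgh


# Hypothetical raw immunoassay readings (arbitrary units), per cell type.
raw = {
    "hematopoietic": {"pXGH5": 50.0, "test": 36.0},
    "fibroblast": {"pXGH5": 200.0, "test": 8.0},
}

# pXGH5 is assigned a value of 1 within each cell type.
relative = {
    cell: relative_expression(readings["test"], readings["pXGH5"])
    for cell, readings in raw.items()
}
```

Because each cell type is scaled by its own pXGH5 value, the two relative figures can be compared even though the absolute readings differ several-fold between the cell types.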
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4. DNA/Protein Binding Analyses — Nuclear extracts were 
prepared from rat basophilic leukemia-1 cells and rat-1 fibroblasts by a
modification of the procedure described by Dignam and coworkers
(Dignam et al., Nucleic Acids Res. 11:1475-1489 (1983)). Each preparation
of pelleted cells (10^) was washed once with 15 ml of ice-cold 10 mM
Hepes (pH 7.9), 1.5 mM MgCl2, 10 mM KCl, 1.0 mM dithiothreitol
(DTT), 0.5 mM phenylmethylsulfonyl fluoride (PMSF), 0.1% leupeptin,
0.1% pepstatin, and 0.1% aprotinin. After a 10-min incubation at 4°C
in the same buffer, the cells were centrifuged for 3 min at 500 x g. The
pelleted cells were resuspended in 1.0 ml of ice-cold buffer, lysed in a
Dounce homogenizer, and centrifuged at 4°C for 10 min at 900 x g and
then for 20 min at 16,000 x g. The supernatants were aspirated and the
pelleted nuclear proteins were resuspended in 3 ml of 20 mM Hepes (pH
7.9), 25% glycerol, 1.5 mM MgCl2, 0.42 M NaCl, 0.2 mM EDTA, 0.5 mM
DTT, 0.5 mM PMSF, 0.2% NP-40, 0.1% leupeptin, 0.1% pepstatin, and
0.1% aprotinin. The pellets were homogenized again in a Dounce
homogenizer, agitated gently for 3 min, and centrifuged at 4°C for 30 min
at 100,000 x g. The solubilized nuclear proteins in each supernatant were
dialyzed at 4°C for 5 h against a 50-fold excess volume of 20 mM Hepes
(pH 7.9), 20% glycerol, 0.1 M KCl, 0.2 mM EDTA, 0.5 mM DTT, 0.5
mM PMSF, 0.1% leupeptin, 0.1% pepstatin, and 0.1% aprotinin, and
then stored at -80°C in this buffer. The protein concentration of each
nuclear extract was determined by the Bradford method (Bradford, M.M.,
Anal. Biochem. 72:248-254 (1976)) using a Bio-Rad (Richmond, CA)
protein assay kit with bovine serum albumin as standard.

DNA/protein-binding experiments were carried out in 20 µl of 10
mM Tris buffer (pH 7.5) containing 4% glycerol, 1 mM EDTA, 0.5 mM
DTT, 0.1 M NaCl, 4 µg of carrier poly(dI-dC) (Stratagene, La Jolla, CA),
and 1 ng of a double-stranded 32P-end-labeled DNA probe corresponding
to residues -250 to -161, residues -118 to -81, or residues -40 to +24 of
the mouse serglycin gene (Avraham et al., J. Biol. Chem. 264:16719-16726
(1989)). In the binding competition assays, 5 ng of the specific
serglycin-derived unlabeled oligonucleotide, 100 ng of unlabeled sonicated
salmon sperm DNA, or 100 ng of a 64-mer unlabeled double-stranded
oligonucleotide that binds transcription factors NF1/CTF, SP1, AP1, and
AP3 (Stratagene Catalog No. 203001) was added to each reaction. After
incubation at 15°C for 30 min, gel mobility shift analyses were performed
to detect the presence of specific DNA-binding proteins in the nuclear
extracts. Samples were loaded onto a 5% non-denaturing
polyacrylamide/bisacrylamide gel (30:1, w/w) that had been equilibrated
before use by treatment for 1 h at 100 mA. The gels were run at 100 mA
at 4°C until the bromophenol blue tracking dye ran approximately
two-thirds the length of the gel. The gels were then dried under vacuum
and autoradiographed generally for 16 to 24 h.
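
As a quick check on the quantities in the binding reactions, the sketch below derives the probe lengths from their coordinates and the mass excess of each competitor over the 1 ng of labeled probe. It assumes the length is a simple coordinate difference, which matches the 89-bp size the text quotes for the -250/-161 probe; the names `PROBES`, `probe_length` and `fold_excess` are illustrative only.

```python
# Probe coordinates relative to the transcription-initiation site (from the text).
PROBES = {
    "suppressor": (-250, -161),
    "enhancer": (-118, -81),
    "promoter": (-40, 24),
}


def probe_length(start, end):
    """Probe length in bp, taken as a plain coordinate difference (assumption)."""
    return end - start


def fold_excess(competitor_ng, probe_ng=1.0):
    """Mass excess of unlabeled competitor over the labeled probe."""
    return competitor_ng / probe_ng


lengths = {name: probe_length(*coords) for name, coords in PROBES.items()}
specific_excess = fold_excess(5)       # unlabeled serglycin-derived competitor
nonspecific_excess = fold_excess(100)  # salmon sperm DNA or the 64-mer
```

For the specific competitor, which has the same sequence as the probe, the mass ratio equals the molar ratio; for the salmon sperm DNA and the 64-mer it is only a mass ratio.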

Two control mixing experiments were used to confirm that the
89-bp probe corresponding to residues -250 to -161 of the mouse serglycin
gene bound to distinct trans-acting factors in nuclear extracts of rat
basophilic leukemia-1 cells and rat-1 fibroblasts. In the first experiment,
5 x 10^ rat basophilic leukemia-1 cells and 5 x 10^ rat-1 fibroblasts were
mixed together, and then a nuclear extract of the pooled cells was
prepared and analyzed in the gel mobility shift assay. In the second
experiment, nuclear extracts were prepared separately from rat basophilic
leukemia-1 cells and rat-1 fibroblasts and were mixed in 1:1, 2:1, or 3:1
proportions just before being analyzed in the gel mobility shift assay.

B. Identification by Deletion Analysis of
Suppressors and Enhancers in the 5' Flanking
Region of the Mouse Serglycin Gene

To determine if the proximal 5' flanking region of the mouse
serglycin gene contains cis-acting regulatory elements, a DNA fragment
that extends 504 bp upstream and 24 bp downstream of the gene's
transcription-initiation site was linked to pφGH. Preliminary experiments
were performed (analyzing the kinetics of secretion and the stability of
hGH) to optimize the transfection assay in rat basophilic leukemia-1 cells,
WEHI-3 cells, rat fibroblasts, and mouse fibroblasts. No hGH was
detected in any cell pellet, and therefore apparently all translated hGH
was secreted. Additional experiments revealed that hGH was not
degraded following its secretion into the culture media.

Rat basophilic leukemia-1 cells, mouse WEHI-3 cells, rat-1
fibroblasts, and mouse 3T3 fibroblasts were transiently transfected with
the resulting plasmid construct, designated pPG(-504/+24)hGH. The
results of the transfection experiments were normalized for transfection
efficiency relative to that obtained with the reference plasmid pXGH5.
As additional controls, cells were transfected with pSV40-hGH and
pTKGH. As shown in Table III, the relative amount of hGH present in
the 4-d conditioned medium of transfected mouse WEHI-3 cells and rat
basophilic leukemia-1 cells was 20-fold and 18-fold higher, respectively,
than for transfected fibroblasts of the respective species. Therefore, the
504-bp region immediately upstream of the transcription-initiation site of
the mouse serglycin gene contains cis-acting elements that preferentially
enhance transcription of this gene in hematopoietic cells.

TABLE III

Relative human growth hormone (hGH) production by four cell lines that have been transiently transfected with
control plasmids and a plasmid that contains the 5' flanking region of the mouse serglycin gene fused to a
promoterless human growth hormone gene.

Plasmid             Relative Expression of hGH*                                    Ratio
Construct           Mouse        Mouse        Rat-1        rat basophilic      WEHI-3/      rat basophilic
                    Fib.         WEHI-3       Fib.         leukemia-1 cells    Mouse Fib.   leukemia-1/Rat Fib.

pXGH5               1.0          1.0          1.0          1.0                 1.0          1.0
pSV40-hGH           ND           ND           0.29 ± 0.06  0.47 ± 0.16         ND           1.6
pTKGH               0.30 ± 0.05  0.11 ± 0.04  ND           0.72 ± 0.21         0.3          ND
pPG(-504/+24)hGH    0.01 ± 0.02  0.20 ± 0.05  0.04 ± 0.01  0.72 ± 0.12         20           18

Fib., fibroblasts; ND, not determined; rat basophilic leukemia-1, rat basophilic leukemia-1 cells; and WEHI-3, mouse
myelomonocytic cells.

* Results are expressed as the mean ± SD of 5 to 6 experiments of 4-d duration, with each experiment performed
on 2 replicate dishes of cells.
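
The "Ratio" columns of Table III are the hematopoietic-cell value divided by the fibroblast value for the same species. A minimal sketch of that calculation for the rat cell pair follows; the mean values are as read from the table (the scan is partly garbled, so they should be treated as a best reading), and the names `TABLE_III_RAT` and `fold_preference` are introduced here for illustration.

```python
# Mean relative hGH expression for the rat cell pair, as read from Table III.
TABLE_III_RAT = {
    "pSV40-hGH": {"rat_fibroblast": 0.29, "rbl1": 0.47},
    "pPG(-504/+24)hGH": {"rat_fibroblast": 0.04, "rbl1": 0.72},
}


def fold_preference(row):
    """Ratio of expression in rat basophilic leukemia-1 cells to rat-1 fibroblasts."""
    return row["rbl1"] / row["rat_fibroblast"]


ratios = {
    plasmid: round(fold_preference(row), 1)
    for plasmid, row in TABLE_III_RAT.items()
}
```

The viral SV40 promoter shows only a small difference between the two cell types, whereas the serglycin 5' flanking region shows a much larger one, which is the basis for the conclusion that the region acts preferentially in hematopoietic cells.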



To locate more precisely these cis-acting elements, 9 additional
plasmid constructs were prepared that had progressive deletions of the 5'
flanking region of this mouse gene fused to the hGH gene in pφGH, as
shown in Figure 8A. Rat basophilic leukemia-1 cells and rat-1 fibroblasts
transfected with constructs pPG(^23/+24)hGH, pPG(-333/+24)hGH, and
pPG(-250/+24)hGH produced amounts of hGH comparable to the
corresponding cells transfected with pXGH5. The production of hGH
was enhanced ~2.5-fold when rat basophilic leukemia-1 cells were
transfected with construct pPG(-190/+24)hGH; production of hGH was
also enhanced when rat basophilic leukemia-1 cells were transfected with
the pPG(-118/+24)hGH construct. When rat-1 fibroblasts were
transfected with constructs pPG(-190/+24)hGH and pPG(-118/+24)hGH,
production of hGH increased 21-fold and 24-fold, respectively. Therefore,
at least one cis-acting element resides between -250 and -190 that
suppresses transcription of the serglycin gene in cells, and this negative
element is dominantly active in fibroblasts. Although rat basophilic
leukemia-1 cells and rat-1 fibroblasts transfected with constructs
pPG(-81/+24)hGH, pPG(-63/+24)hGH, and pPG(-40/+24)hGH produced
some hGH, the amount was substantially less than that produced by cells
transfected with construct pPG(-118/+24)hGH. Thus, at least one
cis-acting element in the nucleotide sequence -118 to -81 constitutively
enhances transcription of the serglycin gene in rat basophilic leukemia-1
cells and fibroblasts. When normalized for the efficiency of transfection,
rat basophilic leukemia-1 cells transfected with construct
pPG(-118/+24)hGH produced 2.7-fold (p < 0.05) more hGH than
similarly transfected fibroblasts, indicating that the enhancer between
residues -118 and -81 is more dominantly active in rat basophilic
leukemia-1 cells than in fibroblasts. Because no hGH was produced by
rat basophilic leukemia-1 cells or rat-1 fibroblasts transfected with
construct pPG(-20/+24)hGH, the proximal element in the promoter
region of this gene must reside between -40 and -20.
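
The reasoning behind the deletion analysis above can be sketched as a comparison of neighboring constructs: if removing a stretch of 5' sequence raises expression, that stretch held a negative element; if it lowers expression, it held a positive element. The code below is a minimal illustration under stated assumptions. Only the 21-fold and 24-fold increases are taken from the text (scaled to the -250 construct as 1.0); the other activity values, the 2-fold threshold, and the function name `scan_deletions` are hypothetical.

```python
def scan_deletions(activities, threshold=2.0):
    """Infer regulatory elements from a 5' deletion series.

    activities: list of (five_prime_end, relative_activity), ordered from the
    longest construct to the shortest. Returns (distal, proximal, kind) tuples
    for each interval whose removal changed activity at least `threshold`-fold.
    """
    calls = []
    for (long_end, long_act), (short_end, short_act) in zip(activities, activities[1:]):
        if short_act > 0 and short_act >= threshold * long_act:
            # Removing the interval raised expression: it held a negative element.
            calls.append((long_end, short_end, "negative"))
        elif long_act > 0 and long_act >= threshold * short_act:
            # Removing the interval lowered expression: it held a positive element.
            calls.append((long_end, short_end, "positive"))
    return calls


# Fibroblast-like series. 21.0 and 24.0 follow the fold increases in the text;
# the remaining values are illustrative placeholders, not measured data.
fibroblast_series = [
    (-250, 1.0),
    (-190, 21.0),
    (-118, 24.0),
    (-81, 3.0),
    (-40, 2.0),
    (-20, 0.0),
]

elements = scan_deletions(fibroblast_series)
```

With these inputs the scan flags a negative element between -250 and -190, a positive element between -118 and -81, and a positive (promoter) element between -40 and -20, mirroring the conclusions drawn in the paragraph above.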

As assessed by RNA blot analysis (Fig. 8B), rat basophilic
leukemia-1 cells contained abundant amounts of hGH mRNA when
transfected with pPG(-504/+24)hGH, pPG(-118/+24)hGH, pSV40-hGH,
or pXGH5. A lesser amount of hGH mRNA was present in rat
basophilic leukemia-1 cells transfected with pPG(-40/+24)hGH, and no
hGH mRNA was detected in cells transfected with pφGH. Rat-1
fibroblasts contained abundant amounts of hGH mRNA when transfected
with pPG(-118/+24)hGH, pSV40-hGH, or pXGH5, but the amount was
below detection when transfected with pPG(-504/+24)hGH or pφGH
(Fig. 8B). The level of hGH mRNA in rat-1 fibroblasts transfected with
pPG(-40/+24)hGH was less than that in replicate fibroblasts transfected
with pPG(-118/+24)hGH.

To determine if the positive cis-acting element in residues -118 to
-44 of the mouse serglycin gene functions as an enhancer and to confirm
that the negative cis-acting element resides upstream of the enhancer, two
5' flanking regions of the gene were ligated into pSV40-hGH to create
constructs pPG(-250/-44)SV40-hGH and pPG(-118/-44)SV40-hGH (Fig.
9). Rat basophilic leukemia-1 cells and rat-1 fibroblasts were then
transfected with pSV40-hGH or with one of the two plasmid constructs,
and the relative levels of hGH in the culture media were determined.
Greater than 3-fold more hGH was detected in the culture medium of rat
basophilic leukemia-1 cells and rat-1 fibroblasts transfected with
pPG(-118/-44)SV40-hGH relative to pSV40-hGH, indicating that the
positive element in this 5' flanking region of the gene functions as an
enhancer. This regulatory element also induced rat basophilic leukemia-1
cells and rat-1 fibroblasts to produce ~3.6-fold more hGH when linked in
the plasmid in its opposite orientation 2.6 kb upstream of the SV40 early
promoter (Fig. 9). The finding that the level of hGH produced by
fibroblasts transfected with pPG(-250/-44)SV40-hGH is approximately
one-half that of fibroblasts transfected with pPG(-118/-44)SV40-hGH
again indicates that there is a negative cis-acting element within the more
distal 5' flanking region of the mouse serglycin gene that is active in
fibroblasts.

C. Identification of the Proximal Region of the Promoter of the
Mouse Serglycin Gene by Site-Directed Mutagenesis - Although no
classical TATA box is present ~30 bp 5' of the transcription-initiation
site of the mouse or human serglycin gene, an ACCTCT TTCTAAA AGGG
[SEQ ID No. 5] sequence is present beginning 22 nucleotides upstream
of the transcription-initiation site. Site-directed mutagenesis was
performed to determine whether or not this region was part of the
proximal promoter of the gene. Constructs were prepared in which the
adenosine at -28 of pPG(-504/+24)hGH was converted into a cytosine, the
cytosine at residue -30 was converted into an adenosine, or the adenosine
at -38 was converted into a guanosine (Table IV). Relative to cells
transfected with pPG(-504/+24)hGH, substantially less hGH was
produced by rat-1 fibroblasts and rat basophilic leukemia-1 cells
transfected with any one of the three mutated constructs. The greatest
inhibition occurred with the construct that had a mutated residue -30.






wo 93/13119 



PCT/US92/11194 






TABLE IV

Relative human growth hormone (hGH) production by rat basophilic leukemia-1 cells and rat-1
fibroblasts transfected with constructs containing a normal and mutated proximal promoter region
of the mouse serglycin gene.

Nucleotide                    Mutation     Relative Expression of hGH*
Sequence                      Position     rat basophilic        Rat-1 fibroblasts
                                           leukemia-1 cells

ACCTCTTTCTAAAAGGG             None         1.00                  1.00
(native) [SEQ ID No. 5]

ACCTCTTTCTCAAAGGG             -28 bp       0.31 ± 0.01           0.38 ± 0.06
(mutated) [SEQ ID No. 6]

ACCTCTTTATAAAAGGG             -30 bp       [illegible]           0.17 ± 0.03
(mutated) [SEQ ID No. 7]

GCCTCTTTCTAAAAGGG             -38 bp       0.43 ± 0.02           0.69 ± 0.02
(mutated) [SEQ ID No. 8]

* Results are expressed as the mean ± 1/2 range of two experiments with each experiment
performed on 8 replicate dishes of cells.
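
The three point mutations can be reproduced from the native sequence with a short helper. This is a sketch under one stated assumption: that the first base of the 17-nucleotide native sequence sits at position -38, which is inferred from the positions of the mutated residues and is not stated outright in the text. The names `NATIVE`, `FIRST_POSITION` and `mutate` are illustrative.

```python
NATIVE = "ACCTCTTTCTAAAAGGG"  # SEQ ID No. 5, written without spaces
FIRST_POSITION = -38          # assumed position of the first base


def mutate(seq, position, expected, replacement, first_position=FIRST_POSITION):
    """Return seq with the base at `position` changed from `expected` to `replacement`."""
    index = position - first_position
    if not 0 <= index < len(seq):
        raise ValueError(f"position {position} lies outside the sequence")
    if seq[index] != expected:
        raise ValueError(f"expected {expected} at {position}, found {seq[index]}")
    return seq[:index] + replacement + seq[index + 1:]


mutant_28 = mutate(NATIVE, -28, "A", "C")  # adenosine -> cytosine
mutant_30 = mutate(NATIVE, -30, "C", "A")  # cytosine -> adenosine
mutant_38 = mutate(NATIVE, -38, "A", "G")  # adenosine -> guanosine
```

The check on the expected base guards against an off-by-one error in the numbering: all three substitutions described in the text find the stated base at the stated position only when the sequence is taken to start at -38.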



Protein/DNA Binding Analyses - Gel mobility shift assays were
used to determine whether or not rat basophilic leukemia-1 cells and rat-1
fibroblasts contain trans-acting factors in their nuclei that bind specifically
to the three identified cis-acting regulatory elements in the 5' flanking
region of the mouse serglycin gene. An 89-bp 32P-labeled DNA fragment
containing the putative suppressor and corresponding to residues -250 to
-161 of the mouse serglycin gene was gel electrophoresed before and after
it had been incubated with the nuclear extracts from rat basophilic
leukemia-1 cells and rat-1 fibroblasts. As shown in Figure 10 for one of
four experiments, in the absence of nuclear extract the radioactive probe
migrated to its expected position at the bottom of the gel (lane 1). When
the 32P-labeled probe was incubated with the nuclear extracts from either
one of the two populations of cells before electrophoresis, it was
selectively retained in the gel by a putative DNA-binding protein
(designated B/F(-250/-161)-I) (lanes 2 and 5). The binding of this
32P-labeled oligonucleotide to B/F(-250/-161)-I was specific because
retention of the probe was diminished when a 5-fold excess of the same
nonradioactive DNA fragment was included in the assay (lanes 3 and 6);
retention was not diminished when a 100-fold excess of sonicated salmon
sperm DNA (lanes 4 and 7) or the 64-mer oligonucleotide that binds
NF1/CTF, SP1, AP1, and AP3 was included in the assay. Based on its
differential mobility in this gel mobility shift assay, a second trans-acting
factor (F(-250/-161)-II) was detected only in the nuclear extracts of rat-1
fibroblasts. Because rat basophilic leukemia-1 cells, but not fibroblasts,
contain proteases in their secretory granules (one of which is a chymase
active at neutral pH (Seldin et al., Proc. Natl. Acad. Sci. USA 82:3871-3875
(1985))), two control experiments were carried out to determine if the
absence of F(-250/-161)-II in rat basophilic leukemia-1 cells was a
consequence of the isolation procedure used to obtain the nuclear
DNA-binding proteins. When the nuclear extracts from rat basophilic
leukemia-1 cells and fibroblasts were mixed in different proportions before
analysis, the level of F(-250/-161)-II detected in the gel mobility shift assay
varied according to the amount of fibroblast-derived nuclear protein that
was used in the assay. Furthermore, the amounts of B/F(-250/-161)-I and
F(-250/-161)-II detected in the nuclear extract of a pooled preparation of rat
basophilic leukemia-1 cells and fibroblasts were compatible with the
results of the extracts of the individual cell types. Thus, the absence of
F(-250/-161)-II in nuclear extracts of rat basophilic leukemia-1 cells was
not a consequence of preferential proteolysis of this trans-acting factor in
rat basophilic leukemia-1 cells.

A 37-bp 32P-labeled DNA fragment containing the putative
enhancer element and corresponding to residues -118 to -81 of the mouse
serglycin gene was gel electrophoresed before and after it had been
incubated with the nuclear extracts from rat basophilic leukemia-1 cells
and rat-1 fibroblasts. As shown in Figure 11 for one of 3 experiments, a
retarded species (B(-118/-81)-I) was obtained when the 32P-labeled
oligonucleotide was incubated with rat basophilic leukemia-1 cell-derived
nuclear extracts (lane 2). B(-118/-81)-I possessed a mobility different from
that of the retarded species, F(-118/-81)-I, present in the nuclear extracts of
rat-1 fibroblasts (lane 5). The ability to inhibit binding of the 32P-labeled
oligonucleotide to B(-118/-81)-I and to F(-118/-81)-I with a 5-fold excess of
the same nonradioactive DNA probe (lanes 3 and 6), but not with a
100-fold excess of sonicated salmon sperm DNA (lanes 4 and 7) or the
64-mer oligonucleotide that binds NF1/CTF, SP1, AP1, and AP3 (data not
shown), indicated that these interactions were specific. Although an
additional retarded species was observed in this gel mobility shift assay
when either nuclear extract was used, its binding could not be inhibited
by an excess of the specific nonradioactive oligonucleotide.

A 64-bp 32P-labeled DNA fragment containing the putative
proximal promoter element and corresponding to residues -40 to +24 of
the mouse serglycin gene was also gel electrophoresed before and after it
had been incubated with the nuclear extracts from rat basophilic
leukemia-1 cells and rat-1 fibroblasts. As shown in Figure 12 for one of
3 experiments, when the 32P-labeled probe was incubated with the nuclear
extracts from either one of the two populations of cells before
electrophoresis, a new species, designated B/F(-40/+24)-I, was observed that
migrated more slowly in the gel (lanes 2 and 7). The binding of this
32P-labeled oligonucleotide to B/F(-40/+24)-I could be competitively
inhibited by a 5-fold excess of the same nonradioactive DNA probe (lanes
3 and 8), but not by a 100-fold excess of sonicated salmon sperm DNA or
the oligonucleotide that binds NF1/CTF, SP1, AP1, and AP3. Additional
distinct trans-acting factors, designated F(-40/+24)-II and B(-40/+24)-III, were
detected in the nuclear extracts of rat-1 fibroblasts and rat basophilic
leukemia-1 cells, respectively. The binding of this 64-bp serglycin
proteoglycan-derived 32P-labeled probe to F(-40/+24)-II, B/F(-40/+24)-I, and
B(-40/+24)-III was minimally diminished in the competition assay when
nonradioactive DNA that had a mutated residue -28, -30, or -38 was used
in a 50-fold excess over the nonmutated 32P-labeled probe (Fig. 12).
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Example 10 
Methylation of the Human Serglycin Gene

The serglycin gene-derived Alu sequences were aligned with the
Alu consensus nucleotide sequence of Jurka, J. et al., Proc. Natl. Acad. Sci.
USA 85:4775-4778 (1988); their locations and characteristics are depicted
in Figure 13A and Table V.



Table V

Distribution and Type of Alu Elements in the Human Serglycin Gene

Twenty-one Alu elements were detected in the nucleotide sequence
of the human serglycin gene (Figure 13A). Of these, 19 were identified
in the introns. Thirteen were of the S type, and 8 were of the J type. In
two instances, only approximately one-half of an Alu element was inserted
in the gene. These elements were oriented in the sense (F) or anti-sense
(R) direction relative to the rest of the human serglycin gene.

Number    Location              Type      D

1         5'-flanking region    S         R
2         5'-flanking region    S         F
3         Intron 1              S         F
4         Intron 1              S         F
5         Intron 1              S         F
6         Intron 1              S         F
7         Intron 1              J         R
8         Intron 1              S         F
9         Intron 1              S         F
10        Intron 1              J         F
11        Intron 2              J         R
12        Intron 2              S         R
13        Intron 2              J(1/2)    F
14        Intron 2              J         F
15        Intron 2              J         R
16        Intron 2              J(1/2)    F
17        Intron 2              S         F
18        Intron 2              S         R
19        Intron 2              J         F
20        Intron 2              S         R
21        Intron 2              S         F



Fifteen diagnostic positions were examined to determine if the
Alu elements were of the "S" or the "J" subfamily. Because the J
subfamily of Alu elements is more similar to 7SL RNA than the S
subfamily, the J type probably is a more primitive Alu element. Eleven
of the Alu elements were of the S type, whereas eight were of the J type.
The two Alu elements in the 5' flanking region were of the S type. The
Alu elements were present in both orientations. Thirteen were oriented
in the sense direction of the gene's exons, whereas the other six were
oriented in the anti-sense direction.

The deduced nucleotide sequence of the human serglycin gene was
used to determine the methylation pattern of this gene in cells that do
and do not transcribe it. Because of their hybridization to corresponding
regions within other genes, it was not possible to probe genomic DNA
blots with short DNA fragments of the human serglycin gene that
contained Alu DNA sequences. Thus, knowledge of the exact location of
the Alu repetitive elements within the serglycin gene (Figure 13A)
permitted the avoidance of those sequences in the methylation study.
Because HL-60 cells, but not Molt-4 cells, contain serglycin mRNA, DNA
was isolated from these two cell types and the methylation patterns of
their serglycin genes determined. The locations of all of the sites
susceptible to HpaII and MspI within the human serglycin gene were
determined (Figure 13B), and PCR methodology was used to construct 13
probes (designated A-M) to determine how many of these 5'-CCGG-3'
sequences contained an internal 5-methylcytosine.



Human genomic DNA was prepared as described by Sambrook et
al., Molecular Cloning: A Laboratory Manual, 2nd ed., pp. 9.16-9.19 (1989),
Cold Spring Harbor Laboratory, Cold Spring Harbor, NY. HL-60 cells
and Molt-4 cells were each centrifuged at 1500 x g for 10 min at 4°C.
The supernatants were removed, and the cells were washed twice with
ice-cold 0.14 M NaCl, 2.7 mM KCl, 25 mM Tris, pH 7.4. The cells were
suspended in 10 mM Tris, pH 8.0, containing 0.1 M EDTA, 0.5% SDS,
and 20 µg/ml pancreatic RNase and were incubated for 1 hour at 37°C.
After Proteinase K (100 µg/ml) treatment under standard conditions, the
digests were extracted 4 times with Tris-saturated phenol and then with
chloroform. The genomic DNAs in the resulting solutions were
precipitated with ethanol. Samples containing about 10 µg of DNA
either were dissolved in 10 mM MgCl2 and 20 mM Tris, pH 7.4, and
digested with HpaII (GIBCO-BRL), or were dissolved in 10 mM MgCl2
and 50 mM Tris, pH 8.0, and digested with MspI (GIBCO-BRL). These
two restriction enzymes both cleave the unmethylated nucleotide sequence
5'-CCGG-3', but only MspI cleaves this sequence if the internal C is
5-methylcytosine (Waalwijk, C., and Flavell, R.A., Nucleic Acids Res.
5:3231-3236 (1978)). The digests were electrophoresed in 1% agarose gels
and transferred (Southern, E.M., J. Mol. Biol. 98:503-517 (1975)) to Duralon
membranes (Stratagene). The resulting DNA blots were hybridized with
random-primed, PCR-derived, 250-880-bp probes that correspond to 13
different regions of the human serglycin gene. These DNA probes were
generated with either the HL-60 cell-derived serglycin cDNA (Stevens
et al., J. Biol. Chem. 263:7287-7291 (1988)) or the serglycin genomic
subclones (Nicodemus et al., J. Biol. Chem. 265:5889-5896 (1990)) as
templates. The probes used in this methylation study were specific to the
human serglycin gene because each hybridized to a single genomic
fragment.
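
The logic of the HpaII/MspI comparison can be illustrated with a small simulation. It is a simplified sketch: sites are treated as single cut coordinates on a linear region, and the region length, site positions and methylation states are hypothetical values chosen for illustration, not data from the serglycin gene. The function names are likewise assumptions.

```python
def fragment_sizes(region_length, cut_positions):
    """Sizes of the fragments produced by cutting a linear region at the given positions."""
    cuts = sorted(p for p in cut_positions if 0 < p < region_length)
    bounds = [0] + cuts + [region_length]
    return [right - left for left, right in zip(bounds, bounds[1:])]


def digest(region_length, sites, enzyme):
    """Simulate a digest at 5'-CCGG-3' sites.

    sites maps a site position to True if its internal C is 5-methylcytosine.
    MspI cuts every site; HpaII cuts only the unmethylated ones.
    """
    if enzyme == "MspI":
        cuts = list(sites)
    elif enzyme == "HpaII":
        cuts = [pos for pos, methylated in sites.items() if not methylated]
    else:
        raise ValueError(f"unsupported enzyme: {enzyme}")
    return fragment_sizes(region_length, cuts)


# Hypothetical 10-kb region with three CCGG sites; the middle one is methylated.
sites = {2000: False, 5000: True, 8000: False}
mspi_fragments = digest(10000, sites, "MspI")
hpaii_fragments = digest(10000, sites, "HpaII")
```

A probe that lies next to the methylated site detects a 3-kb fragment in the MspI digest but a 6-kb fragment in the HpaII digest. Identical fragment sizes in both digests therefore indicate an unmethylated site, and a larger HpaII fragment indicates methylation.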

When blots containing digested genomic DNA from HL-60 cells
were analyzed with the intron 1 probes A-F, each probe hybridized to a
DNA fragment in the MspI digest that was identical in size with the
corresponding fragment in the HpaII digest. Therefore, the 5'-flanking
region and intron 1 of the serglycin gene were both hypomethylated in
HL-60 cells. Like the intron 1 probes, the intron 2 probe J hybridized to
identically sized fragments in the HpaII and MspI digests of genomic
DNA from HL-60 cells. In contrast, several of the other sites in intron
2 of the serglycin gene were at least partially methylated in HL-60 cells.
Which of the five HpaII/MspI sites at the 3' end of this gene are
methylated could not be conclusively determined because of their
proximity to one another, but probes K and M both hybridized to larger
DNA fragments in the HpaII digest, as compared with the MspI digest.
Thus, some, if not all, of these sites at the 3' end of the serglycin gene are
methylated in HL-60 cells. Whereas probes G, H, and I all hybridized to
the expected size of DNA fragments after digestion of genomic DNA with
MspI, they hybridized to two to three fragments after digestion with
HpaII. The presence of a 3.2-kb DNA fragment that hybridizes to both
probe H and probe I argues that the HpaII sites that reside at 11.5 and
11.8 kb in the serglycin gene are methylated in most HL-60 cells. In a
second experiment, these two sites were methylated in almost all of the
HL-60 cells in the culture. Exon 2 probe G hybridized to two
approximately equal fragments in the HpaII digest, indicating that the
HpaII/MspI site at 9.5 kb in the serglycin gene was methylated in
approximately 50% of the HL-60 cells in the culture. These findings are
not the result of incomplete digestion of the DNA samples or nonspecific
hybridization of the probes with a fragment from another gene, because
the same blot yielded single bands after hybridization with other probes
and because single bands were detected when MspI-digested genomic
DNAs were analyzed with these same serglycin-derived probes.

In contrast to the gene in HL-60 cells, the serglycin gene in Molt-4
cells was highly methylated. Probing with any of the PCR-derived DNA
fragments yielded genomic DNA fragments of >10 kb, indicating that
most, if not all, of the HpaII sites in the serglycin gene of Molt-4 cells
were methylated.

Several genes have been reported preferentially to contain
cis-acting regulatory elements in their first introns. Although it has not
been determined if the transcription-regulatory activities of any of these
elements in intron 1 are affected by methylation, it has been shown that
CpG methylation of the cAMP response element found in the promoters
of many genes abolishes its transcriptional regulatory activity
(Iguchi-Ariga, S.M.M. et al., Genes & Dev. 3:612-619 (1989)). The
diminished methylation of the first intron of the serglycin gene in a cell
that contains abundant levels of this transcript, but not in a cell that does
not transcribe the gene, suggests that specific methylation-dependent
nucleotide sequences in intron 1 act in concert with the identified
sequences in the 5'-flanking region (Avraham, S. et al., J. Biol. Chem.
267:610-617 (1992)) to regulate transcription of the serglycin gene in
different cell types.

All references cited herein are incorporated herein fully by
reference. Having now fully described the invention, it will be understood
by those with skill in the art that the same may be performed within a
wide and equivalent range of conditions, parameters and the like, without
affecting the spirit or scope of the invention or any embodiment thereof.
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WHAT IS CLAIMED IS: 

1. An isolated DNA sequence consisting essentially of the 5'
regulatory region of the human or mouse serglycin gene, said regulatory
region comprising:

a. the promoter element located between nucleotides
-40 and -20 of said serglycin gene;
b. the negative transcriptional regulatory element
located between nucleotides -250 and -190 of said
serglycin gene; and
c. the positive transcriptional regulatory element
located between nucleotides -118 and -81 of said
serglycin gene.

2. The sequence of claim 1, wherein said serglycin gene is the
human gene.

3. The sequence of claim 2, wherein said sequence consists
essentially of bases -250 through -1 of Figure 4A.

4. The sequence of claim 1, wherein said serglycin gene is the
mouse gene.

5. The sequence of claim 4, wherein said sequence consists
essentially of bases -250 through -1 of Figure 5.

6. An isolated DNA sequence consisting essentially of the
negative transcriptional regulatory element located between nucleotides
-250 and -190 of the serglycin gene, and such element being dominantly
active to inhibit transcription of operably linked genes in fibroblast hosts.
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7. The sequence of claim 6, wherein such element is from the 
mouse serglycin gene. 

8. The sequence of claim 7, wherein such element consists
essentially of bases -250 through -190 of Figure 5.

9. The sequence of claim 6, wherein such element is from the
human serglycin gene.

10. The sequence of claim 9, wherein such element consists
essentially of bases -250 through -190 of Figure 4A.

11. An isolated DNA sequence consisting essentially of the
enhancer transcriptional regulatory element located between nucleotides
-118 and -81 of the serglycin gene, and such element being dominantly
active to stimulate transcription of operably linked genes in hematopoietic
cells.

12. The sequence of claim 11, wherein such element is from the
mouse hematopoietic serglycin gene.

13. The sequence of claim 12, wherein such element consists
essentially of bases -118 through -81 of Figure 5.

14. The sequence of claim 11, wherein such element is from the
human serglycin gene.

15. The sequence of claim 14, wherein such element consists
essentially of bases -118 through -81 of Figure 4A.



le. An isolated DNA sequence consisting essentially of the 
eukaiyotic promoter element located between nucleotides -40 and -20 of 
the sergtydn gene, and such element being dominantly active for the 
promotion of transciption inoperably linked genes in hematopoietic cells. 

5 17- The sequence of claim 16, wherein such element is from the 

mouse serg^n gene. 

18. The sequence of claim 17, wherein such element consists 
essentially of bases -40 through -20 of Rgure 5. 

19. The sequence of claim 16, wherein such element is from the 
10 human seiglycin gene. 

20. The sequence of claim 19, wherein such element consists 
essentially of bases -40 through -20 of Figure 4A- 

21. A DNA expression vector comprising the DNA sequence of 
any of claims 1-20. 

15 22, The expression vector of claim 21, wherein said vector is a 

plasmid. 

23. The expression vector of claim 22, wherein said vector is an 
K coZ^mammalian cell shuttle vector. 

24. The expression vector of claim 21, wherein said DNA
sequence is operably linked to a gene of interest.

25. A host cell transformed with the DNA expression vector of
claim 21.

26. A host cell transformed with the DNA expression vector of
claim 24.

27. The host cell of claim 24, wherein said host is a
hematopoietic cell.

28. The host cell of claim 24, wherein said host is a fibroblast.

29. A method for producing a gene of interest, said method 
comprising: 

(a) transforming a host cell with the expression vector of claim 
24, wherein said gene of interest is operably linked to said 
DNA sequence and heterologous to said DNA sequence; 

(b) expressing said gene of interest in said host cell; and 

(c) collecting said gene of interest. 



30. A method for inhibiting the production of a gene, said 
method comprising: 

(a) transforming a host cell with the expression vector of claim
24, wherein said gene of interest is operably linked to said
DNA sequence and wherein said gene of interest encodes
an antisense RNA complementary to said gene whose
production it is desired to inhibit; and

(b) expressing said antisense RNA in said host cell.

31. A cell-free composition comprising B/F^,250Ai6ir^- 

32. A cell-free composition comprising ^/P^.250/-\6\)'^^' 

33. A cell-free composition comprising Bj^^^g^ij-L

34. A cell-free composition comprising F^.^g^^j-L 

35. A cell-free composition comprising B/F^^/^2Ay^ 

36. A cell-free composition comprising F(-40/+24)"n-

37. A cell-free composition comprising ^^^/^24)'^^
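
The claims above identify each regulatory element by base coordinates counted upstream of the transcription initiation site (for example -250 through -190, -118 through -81, and -40 through -20), and claim 30 relies on an antisense RNA complementary to a target gene. A minimal Python sketch of that coordinate arithmetic follows. It assumes, as labeled in the comments, that the last base of the supplied 5' flanking string is position -1 and that there is no position 0; the function names and the placeholder sequence are hypothetical and are not taken from the specification.

```python
def extract_element(flank: str, start: int, end: int) -> str:
    """Return bases start..end (inclusive) of a 5' flanking sequence.

    Assumption: flank[-1] is position -1, the base immediately upstream of
    the transcription initiation site, and there is no position 0.
    """
    if not (start <= end < 0):
        raise ValueError("expected start <= end < 0")
    n = len(flank)
    if -start > n:
        raise ValueError("flanking sequence is shorter than requested")
    return flank[n + start : n + end + 1]


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (antisense strand)."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


# Synthetic 504-base placeholder; this is NOT the serglycin sequence.
flank = "ACGT" * 126

suppressor = extract_element(flank, -250, -190)  # 61 bases
enhancer = extract_element(flank, -118, -81)     # 38 bases
promoter = extract_element(flank, -40, -20)      # 21 bases
print(len(suppressor), len(enhancer), len(promoter))
print(reverse_complement("ATGC"))
```

With a real flanking sequence substituted for the placeholder, the same slices would give the spans named in claims 8, 13 and 18, provided the numbering convention assumed here matches the figure.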
cDNA-H19
cDNA-H12
   1 GTGCAGCTGGGAGAGCTAGACTAAGTTGGTC ATG ATG CAG AAG CTA CTC AAA TGC
                                     M   M   Q   K   L   L   K   C      8

  56 AGT CGG CTT GTC CTG GCT CTT GCC CTC ATC CTG GTT CTG GAA TCC TCA
     S   R   L   V   L   A   L   A   L   I   L   V   L   E   S   S     24

 104 GTT CAA GGT TAT CCT ACG CAG AGA GCC AGG TAC CAA TGG GTG CGC TGC
     V   Q   G   Y   P   T   Q   R   A   R   Y   Q   W   V   R   C     40

 152 AAT CCA GAC AGT AAT TCT GCA AAC TGC CTT GAA GAA AAA GGA CCA ATG
     N   P   D   S   N   S   A   N   C   L   E   E   K   G   P   M     56

         XmnI
 200 TTC GAA CTA CTT CCA GGT GAA TCC AAC AAG ATC CCC CGT CTG AGG ACT
     F   E   L   L   P   G   E   S   N   K   I   P   R   L   R   T     72

 248 GAC CTT TTT CCA AAG ACG AGA ATC CAG GAC TTG AAT CGT ATC TTC CCA
     D   L   F   P   K   T   R   I   Q   D   L   N   R   I   F   P     88

 296 CTT TCT GAG GAC TAC TCT GGA TCA GGC TTC GGC TCC GGC TCC GGC TCT
     L   S   E   D   Y   S   G   S   G   F   G   S   G   S   G   S    104

 344 GGA TCA GGA TCT GGG AGT GGC TTC CTA ACG GAA ATG GAA CAG GAT TAC
     G   S   G   S   G   S   G   F   L   T   E   M   E   Q   D   Y    120

             AccI
 392 CAA CTA GTA GAC GAA AGT GAT GCT TTC CAT GAC AAC CTT AGG TCT CTT
     Q   L   V   D   E   S   D   A   F   H   D   N   L   R   S   L    136

 440 GAC AGG AAT CTG CCC TCA GAC AGC CAG GAC TTG GGT CAA CAT GGA TTA
     D   R   N   L   P   S   D   S   Q   D   L   G   Q   H   G   L    152

 488 GAA GAG GAT TTT ATG TTA TAA AAGAGGATTTTCCCACCTTGACACCAGGCAATGTA
     E   E   D   F   M   L   *                                        158

 544 GTTAGCATATTTTATGTACCATGGTTATATGATTAATCTTGGGACAAAGAATTTTATAGAAAT
 607 TTTTAAACATCTGAAAAAGAAGCTTAAGTTTTATCATCCTTTTTTTT(T)CTCAT



FIG.2 
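
The figure pairs each codon of the human cDNA with a one-letter amino acid. As a hedged illustration of how that pairing can be checked, the Python sketch below translates a codon string with the standard genetic code. The `translate` helper is hypothetical, and the two test strings are the first eight and last seven codons of the open reading frame as read from the figure, with OCR-damaged characters restored on the assumption that the printed amino acids are correct.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"  # first base T
    "LLLLPPPPHHQQRRRR"  # first base C
    "IIIMTTTTNNKKSSRR"  # first base A
    "VVVVAAAADDEEGGGG"  # first base G
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string; '*' marks a stop codon."""
    dna = dna.upper().replace(" ", "")
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Codons as read from the figure (assumed reading frame starts at the first ATG).
print(translate("ATG ATG CAG AAG CTA CTC AAA TGC"))  # first eight codons
print(translate("GAA GAG GAT TTT ATG TTA TAA"))      # last six codons plus stop
```

Running the same check over every row is one way to flag positions where the nucleotide line and the amino acid line of a transcribed figure disagree.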

-6H1 GimiKTCTCCAGCCTHiGTGAC/mGAGACTlXATCTCAAAAAM^^ 

-557 AAAAAtWyWAAGAAGAAGAAGAAACTmCATCTGAM^ 

-479 nTGAAGmCACnCACCAKnGGCTCWrrGAGGTATGnACTimGCTa^^ 

-401 atcttgaaagtgcttggtgacgaaaaggcagcacctagatcccttatctcataaaaaa™ 

-323 AGCAATCTAinATnAGAnGnArcTGAAGAAAGGAAAAACAAACTGTimAATGCTGAn^^ 

-246 GAAAAAAAAATGTCTTGCAGGiMTGGimAACAAAACTnTGAAAAAGCAGGlXT^ 

-168 nCATAATGGGTATGAATAGnATnTACTGTGnilimAimmcmCTGGGm 

-89 CTAmGnCAGGAAATTGTGACGTGTGnCTGGGCAGGGTnGAGGnnGGAACATmCTAAAAGGGACAQ^ 



exon 1

-11 CACCCTGCTACATTTCCTAATCAAGAAGTTGGCGTGCAGCTGGGAGAGCTAGACTAAGT

start
ATG ATG
Met Met
1

CAG AAG CTA CTC AAA TGC AGT CGG CTT GTC CTG GCT CTT GCC CTC ATC
Gln Lys Leu Leu Lys Cys Ser Arg Leu Val Leu Ala Leu Ala Leu Ile
                            10
                         exon 1 -|
CTG GTT CTG GAA TCC TCA GTT CAA Ggt
Leu Val Leu Glu Ser Ser Val Gln (Gly)
    20

aagactcaggagtcttg-ttccccagccatcttc

                                        |- exon 2
-( 8 kb)-tacttag-taacaatgtgggttcctcgggca gGT TAT CCT ACG CAG AGA GCC AGG
                                       (Gly) Tyr Pro Thr Gln Arg Ala Arg
                                                     30

TAC CAA TGG GTG CGC TGC AAT CCA GAC AGT AAT TCT GCA AAC TGC CTT
Tyr Gln Trp Val Arg Cys Asn Pro Asp Ser Asn Ser Ala Asn Cys Leu
                    40

GAA GAA AAA GGA CCA ATG TTC GAA CTA CTT CCA GGT GAA TCC AAC AAG
Glu Glu Lys Gly Pro Met Phe Glu Leu Leu Pro Gly Glu Ser Asn Lys
                                    60



FIG.4A 



                             exon 2 -|
ATC CCC CGT CTG AGG ACT GAC CTT TTT CCg taagtggacttttctctaattaattaatt
Ile Pro Arg Leu Arg Thr Asp Leu Phe (Pro)
            70

                                           |- exon 3
-(~6 kb)-tactggtttttttcccatttttctttcatacttc agA AAG ACG AGA ATC CAG GAC
                                          (Pro) Lys Thr Arg Ile Gln Asp
                                                            80

TTG AAT CGT ATC TTC CCA CTT TCT GAG GAC TAC TCT GGA TCA GGC TTC
Leu Asn Arg Ile Phe Pro Leu Ser Glu Asp Tyr Ser Gly Ser Gly Phe
                            90

GGC TCC GGC TCC GGC TCT GGA TCA GGA TCT GGG AGT GGC TTC CTA ACG
Gly Ser Gly Ser Gly Ser Gly Ser Gly Ser Gly Ser Gly Phe Leu Thr
    100                                     110

GAA ATG GAA CAG GAT TAC CAA CTA GTA GAC GAA AGT GAT GCT TTC CAT
Glu Met Glu Gln Asp Tyr Gln Leu Val Asp Glu Ser Asp Ala Phe His
                    120                                     130

GAC AAC CTT AGG TCT CTT GAC AGG AAT CTG CCC TCA GAC AGC CAG GAC
Asp Asn Leu Arg Ser Leu Asp Arg Asn Leu Pro Ser Asp Ser Gln Asp
                                    140

TTG GGT CAA CAT GGA TTA GAA GAG GAT TTT ATG TTA TAA AAGAGGATTTTC
Leu Gly Gln His Gly Leu Glu Glu Asp Phe Met Leu stop
            150                                     160

CCACCTTGACACCAGGCAATGTAGTTAGCATATTTTATGTACCATGGTTATATGATTAATCTTGGGACAAAGAATTTT
ATAGAAATTTTTAAACATCTGAAAAAGAAGCTTAAGTTTTATCATCCTTTTTTTTCTCATGAATTCTTAAAGGATTAT
GCTTAATGCTGTTATCTATCTTATTGTTCTTGAAAATACCTGCATTTTTGGTATCATGTTCAACCAACATCATTAT
GAAATTAATTAGATTCCCATGGCCATAAAATGGCTTTAAAGAATATATATATATTTTTAAAGTAGCTTGAGAAGCAAA
TTGGCAGGTAATATTTCATACCTAAATTAAGACTCTGACTTGGATTGTGAATTATAATGATATGCCCCTTTTCTTATA
AAAACAAAAAAAAAATAAT
GCCACTGCTCTCCAGIXTGGGTGACAGAGTGAGACTC 

-584 CATCT( MAAMA/\fl/\AA A AA / \M I W \ AM AGM(^ 

nCATCTGAAATlXGWIAACTCAncnGWWiCTTAGAKTCAGCTniM^ 
CAlMiCTTGCCTCAGTGAGCTATGnACTmGGTGAAAAAGAAAATG^^ 

-404 mATGTTG/y\AGrGCTTEGTGACGAAAAGG(:AGi:ACCTAGATCCCT7ATCT(:ATAAA^ 
ATKAGCAGATTCnAATATTAGCAATCTAGTATTTAGATTGTTACCTGAAGAAAGGAAA 
AACAAACTGTCCCAAArGCTGATTCTACTGTTTCGGTGGGAAAAAAAAATGTCTTGCAGG 
- 224 CAAGTGGCAAACAACAAAACTnTGAAAAAGCAGGCCTGGGGGGAGTIXAGTACAGrrTC 
ATAATOTATCAATAGTTATnTACTGTGTTimaACimTnCTnCTGGGTTn 
GATGTGGATGTCmCTAnTGnCAGGAAATTGTGACGTGTGnCTGGGCAGGGTnGA 

-44 CGTTTTGGAACATTnCTAAAAGGGACAGAGAGCACCCTGCTAC 



I ATTTCCTAATCAAGAA 

GTTGGCGTGCAGCTGGGAGAGCTAGACTAAGTTGGTCATGATGCACAAGCTAC^^ I EM 1 
I CAGTCGGCTTGTCCTGGCTCTTGinTCATCCTGGTTCTGGAATmAG TTC^ r- ^ 

GTAA 

137 GACTCAGGAGTCTTCTTCCCCAGCCATCTTCTCTGTAAGCCCTGTGGTCCAreCAAGTW 
nATAnCATmAAGBCATAGAATGTATAATATTGTGAGAAAGGAGGCAAAGAAGAAGG 
ATTTGGGGTCGCTGAACCCTnAATATGAGnCTGnAAGTnGGTACCAAGAAAAAnA 

317 AACTCTGTGGCGTGTGCAGTCnGTAAACTCTTACAATGAnGAAATCTGCTATTTTGGG 
ATGAAAATGTGAGGTTTATAAAnnAAAAGCTCAAAAAAGGAATCTAGAAAATGACTCC 
TGTGCCTGTTGCATGGAGGAGATGGCACCTnGACTGTTGGGGGGTGTCTGCCTACCCCT 

497 AAGTGTCTACATCAGIXCCAAGnnACTGIKTGTGACGGTGTCAnGnATTn^^ 
CTGGGAGAmATATlXCAAnGGGGTGAATCTGACTGTGTGTAnnCTnTCTTTn 
TnnnnAAAGATAAACTTGGncnACTGAAAACTCAATTATGGTTAGACATAGTTC 

677 ATGTAAAAIXTCTmnnAAAGAGAAGGCCAAATAAnTGGTATnGTGCTCnGCT 
CAGAGAAGCATCATATTCCGAAATATCTTCCTAGGTnATCTACCATnAGTGTTGTTTA 
GTCAGACTGAAACAACnAAAACCTGTAATGACTAAGACAATGAAAATGATAGGCnGTA 

857 AGAAAAATACAAnTGTTATTCnTGGCAAATAAGGAATCATGTCTAAATAAGACGGAGG 
TCATGGCnGATAGAGAGATGGCTGAAlXTATAGTAGAAAAACACTAGGnimAAAT 
GGTAAGGGAAATGTTGAGTCACAATGACACACATGTaTAGATnGTniHim^ 



FIG.4B 






1 , 037 CTmCGnGTCATGATCTTACTTlXGinGGAGATGAMTCTTACAGATGATC^ 
CAnCATmATGTTGGAAAmAT/W\AT(:ArmcnCTACTTATGCTAAT^^^ 
AAAGAGCAAGTAATGTTTCTGGAACGTTATTAAT7TATGTATTTTTAAAATATAAAACAT 
1,217 TGTCAAnGTAGGGAACAEGCTTCACTGGGATCnnAGGGAATATCnCAGCnCATGA 
AATAAncaGAATAGClMTGGTCTGACAAGATCGAGAGTAATGAGGCCCATACTnA 
GTACAGTCTTGAATG6CCAGATGGTGCTGGGCATACCCCAACCAGAGATATGTAAGTCTT 

1,397 TAH^TlMAmCCCAGAAACATGAAmimTAAGAnCAn 

GAATGAAAACAAAAACGTTCCnGTATAATATTCATTAGAAAGAAATGAAGAAGGCCGGG 
CATGGTGGCTCACGCCTGTAATCCCAGCACTnGAGAGGCCAAGGTAGGCAGATCATGAG 

1,577 GTCAGGAGmGAGAIXAGCCTGGmCATAGTGAAATCCCGTCTCTAIXAm 
AAAAAATrAGCCGGGCATGGTGGCACACAIXTGTCATmGCTACTeAGGAGGCTGAffi 
CAGGAGAATTGCTTGAACCTGGGAGGTGGAGGnGCAGTGAGCTGAGATTGCACCACT^ 

1,757 ACTACAGCCTAGGTGACAGTGCAAGACTCTCTCAGAAAGAAAGAAACAGAGAGAGAGAGA 
AAGAAAGGAAAGAAAGAAAGAGAAGGAAAGAAAATAAnCATCATGAAAHGTATAGAAT 
ACTAGCAmATGTCATGACCTCGTAGGTnAGCTCTnGnAGAAAACGAAACCATAGA 

1,937 AAGAGACAAGGGAGAAACTGACAAACTAGGGTCmCCGAAAAAAGGCTCTCAGTATCGG 
GCTCAAGGGCTTGTGlXCACATCTGAGCATGCAGGGAAATAGATGTCttCCACTG^ 
ACATGTGAGTGACTGCGGCACAAGGCTGTGATGTGAAGAinCATGACACCATnCCTCAC 

2,117 AlXTIBCGCAATlMATATGAniBCAACAmniXTGTCTTATAAAAGTm 
TCTAGimTTGGmGGCAGATGAAATCAACTAGGCTTnGGCTTGCTnTACTGAGCA 
TAHCAAAAIIAmCAGGTCACTATAIITGGmGCTCGGGTTGCCATAACAAAGTACCA 

2,297 CGGACTGAGTGGCnAAATAACAGAAATGTAinaTGACACTTCTGGAGATGGGACT 
AAGATCAAGGTGCTmGGCrmCGCAnCTGAGGIXTCTCTttnBGCTTATA^^ 
GGmnCTlKTTGTCTGCACATGGlXTnCCTIIATGCATlXtnGTCnAATAlB^ 

2.477 CncnAGAGGGTCACCAGGCAnGGATCAGGGCCACCCTAATGGCCTCATCnAACTAC 
CTATATCTGCAATGAIXCTAmCTGAACAATnCACAnGTGAGATTCTIITGGGnAGA 
ACTlJGAACATATGAATTraiTGGTGGTATATTmAnATAAGTlMCCC^ 

2,657 TGTGGGGTAAEATTGTGmACCAAGCACAAAGAAATGGAAAinGGGGATGTGTAACTC 
TGGAGAGCACAGATGACTAATCTAmAAMAGGGaCCAGGGGATnGATGAGGCCTG 
TGAATCniXACmrATTGCCTCTCmTmTGACAIICflTAAAGAAAAAAAATO 

2,837 ATATCCATGAACAGGTGCAGIXAAGGAGGKAGGramATGTGTCCAa^ 

CTOTAGCTCACAGGAATGATACTGATlXACTKniITGCTGCTCmGTAAACrGftTn 
CACATIXAnCTCTGGTAATCATCATCACAnraGTCATGAGGAAnAGimAAn^ 
3, 017 TAGAGGAGAAAACTGGATCTGACATnCTCATCTCATnKTCTAT^^ 
CAAAAAATnOTGAGTCTTGCCCAAGACIXAmCAACTAATTAAK^^^ 
THGAmimilAGAAnAACmTACTGCAGCTCATGGATCAQ 

3,197 mAAACAAACAAAIMAAATlXTnGAraAimnGCTATAACAmim 
GTTIffllAGAAATTGAAAGrAAACnA™ 
AGTGIMiAAGCnAnXATCTTinGGAAnGATAG^^ 

3,377 AMTAATenACCATmGAATAAATCAAATGCTCTTCnnmnCATGGTAIinGC 
TGCnAAGmCTCTAACATGCCTGCAGTAAGTnCCAnAAGAATAGGAAAnAGGCTC 
GGTACAGTGGCTCAOITCTGTACTCnAGCAanGGGACACCGAGGAinGTGGA^ 

3,557 nMJTCAGGAGnCAAGACDMiCaGGimATGETGAAAimCATCTCTACTAA^ 
ATACAAAAAnAGTCAGGCATAGTGGCATGimTirrAATimGCTACTCGAGA^ 
AGGCAGGAGAATCACnGAACCAGGGAGATCn^AGGTrGCAGTGAGCCAAGATCATGCCAC 

3.737 TGCACTCCAGlXTGGGCGATAGAinGAGACTCTGTCTCAAAAAAAAAAAAAAAAAAA^ 
AAAAAGAAAAGAAAAAAAAAAGAAATTAGTATAnGTGATTATEnGAGGGAAAAGnAG 
TACCATAATATAAAAAGGTATGGACTATTGGAGAAAGnGTTTGCrrTGGTAACATTTAC 

3.917 TCATAGAAAGTATTnG&TAAAGCAGGACTCAGGGTGGTGGGGGAGGTGGGCAGTGAGGG 
ATAGGAHimTAAAAAlXATTlJrTinnGGAATimACACAATTAACl^ 
TraiAAGTHiACClTnAGGAAGATAACAmCTATIIATGAK^^ 

4. 097 CACAAGACAmATCTCAAGIMiATAGAGTCAAGATACTCTCACAAIXTCAGGGGCTK 
AACTCTAAATmCACATCCTGmACCCnGAATAGCTATinCAAGM 
TGTAACTTGnmTATmAAAGTACAmAACATCATCGGimAAATTAGATm^ 

4,277 TTTGGAGlTOTCIXnCTACTmGAmmATAAAATTnAAAATAK 

AGATAGTIiTTCAMTAIIATACAmiMCAnAAAAGTGnCAA™™™ 
CAGGTGTGCTGGCTCATGIXTGTAATimCATTnGGGAGGCCAAeGm 

4.457 CTTimCAAGAGmGAGACTAGinGGimATGGTGAGAAmCTCTCm 
AATACAAAAAnAAaGGGCATGGTGGTinGCACCTGTAAniXAGCTACnGGGAffi^ 
GAGGCAGGAGAATCACTTGAACaGGGAGGTGGAGGnGCAIITGAGCTGAGATTGC^ 

4,637 aGCACTtXAGKTGGATGACGGAGCAAGACTCTgTCTCCAA AAAAAAAAAA^^ 
' TGnCAATGTTTmAgATAnCACAGAGTCATGlMIATCACTATAAnGCTTaAG 
AACAnnCATCATimCAAAAGAAAGIXTrCinTAlBAnnAAnra 

4.817 TllAACTCiraiCAATTnGTAnCTAGAAATATrTTnACTAATATGCTAI^^ 
TTGTCATGCTGGTGAAAAGATGTGGTCTnCACCTGGATGCTnCTCAnAAGCAnATT 
mCTGmAGCTTlXTGTGTGAGlMCATTnCTCAGCnGATACTCAGTGCATCAGC 
4,997 KCTT6i:A(MIAli/M:T(ICCTAaiCCTKTCT(na^^ 

miACTGAACWlTTIMTirrGGIXAAGaMnffiGACm 
ATACAAGAnTGGMlKmCTATGAACAAA/MiAATffi/W^W 

5,177 nACATTGATGATATGTrAAAHl/mTAmGAATATGnAGGnA/^^ 

TCAKTGmcnmACTTnAAAAATATGGCaCTGGAACAnTAAAACTCCCTATGT 
GGTTTGCTnGTGTnCTAnCGACAAAGnGGTCTAGACAGTACAAGGTGTGAAGACAC 

5,357 CGlXCTCniCTGGAGAAGATGCTGGAmnAmCACCTACAGGAAGAGACGTCTAAGT 
AGCAAnAGATGCTAAACTAATGCTGCCTCAGGAAAGAATCAAAAGAGAAAGACTGAAAC 
CAGGtmramTCACGIXTATAATCTCAGCNACmGGGAGGaA^^ 

5,537 GGATCA(BGGTCAGCAGAT{MiA(Xn(XTGGCTAACIXCGTCTCTATTAAAACTO 
AAAAAnAGCCGGGCATGGTGGCACGTGCCTGTAATTCCAGCTAGTCGGGAGGCTGACGC 
AGGAGAATAGCTTAAACCCAGGTGGCGGAGGGTGCAGTGAGCTGAGATGTGCCACAGCAC 

5,717 TCCA6CCTGG6CAACAGAGCCAGACTCTATCTCAAAATAAGGAAAAATAAAAAAGAAAAG 
AAAGAAAGTCCATAAATTGAGACTCCTAGAGATACTAAATGGTAGAATGGGAATTTGAAT 
nAAATnATAAGATGnCAGTCTCGGAGATCATAGGTCAnGniiTCCTCCTCCTTTTC 

5,897 ATGACAGGAACTAGCAATGAAGAGCTCTGACTATGTGCTAGSTACTACTCTGAGAACCTA 
ACATTTGTATCTrenAnAACTCTAnACTGimTCCTACAGATGAGAAAAnGAGG 
CACAGGAACTnAAGTTGGCCAAGATCACACAGCCAGTAAGGGGCAGACATTGAAAGGTC 

6, 077 ATmUlITGIITTATCCIXAGCDTCCAGGCAGTGGCAGAGnAGCTCATnTGGACAAA 
CAGCTCTCKAGAaAGACAnGTAAGCTATACTCAGGAATCATAGGAAAGAnATGATA 
GAATAATATATAGnACAAAGAAAAGAAAGAAAATCCAATGCGAGAATATnACTOTT 

6,257 CTATAnAAAGTGTnAATGTTTATGnTnAGAGGAATAnGTTTATTATAGCAAnTA 
GAAAACAAAATGAGAAAAAAAATIMAAAGAFCTAttTIXAGTTAnTTlTiT^^ 
(inilATTTTTliammCTGnTATATAAnGAAACTAnAnMTGCAAAm 

6,437 TAnCTGATTnCTCAGTrATTTTTATnATTmAAmTGTAAATAAACTTTTTTCTT 
CTGAGACAGTCTCGCTGTGTCACCOTCTGAAGTGimCGTGCAATCm^ 
T6ACCTCCCAGTCTCAAQCAATCTT(XTGa:TCA6(n:TaTAAGTAGCTGGGACT^ 

6,617 niTGTUCTAIXACATimTAATnnAAAATCTTTnTCTCTTnnGGTAGAAATGG 
' GAGTCTCTCTATGTTGfieAGlXTGTAnGAACTKTGGGCTIMnGATttTim^ 
CAGTCTCCCAAAGTGCTGGGATnCAGGCATGAGaAIXACACimCTAATrmAn 

6.797 nTATnTATTAAAAAAAAAAGTTmGCCTAramTCTCAimCTCACTCATTm 
. AAGTATAGGmnCTACAATGCCACAnGTCTnilXATAAAAAGTACmcaiCCGGG 
TGTGGTGGCTCACACClTiTAATimGCACmGGGAGACTGAGGlHiGTGGATim 
6,977 AGBTCAGGftmiMiAIMaiTGGimATGIiTG/mXATra 

CAW«y\nWiCTGGATATGATTGTGGGimTGTCATIXCAGCTACTraAG^^^ 
CAGGAGAATCGCTTGAACCTGGGAGGCAGAGGTTGCAGTGAGCCOAGAnGTGCCACTGC 

7,157 ATCCCAGCCAGGGCAACAGAGCGAGACnCATGTCAAAAAAAAAAAAAAAAAGTACTTn 
CnCATTTGGnAGTATTCTCTTATGAGnGATGCCnCTAArrTATCTGAATGTTnCC 
ATTATmCTGCTGAGCnTAAAACTAlXCnCCTGACTTTIMiAATCCTAGACATGCT 

7,337 CCTTGTTGCTAGGTAAnAnAGnGCACTCAnAGAATAAAGTATATGCTTGGAGTGGG 
. GAGGAGATGAACTTmGAAGGGCGGTGAAGTATTTCTCACCACCAGGCCTTTGTCTTTG 
CTAAACTGAGGAAGGAAGATTTTATnCAnAGCTAACAAAGAACCTCCTATATAGGCCG 

7,517 GGCATGGTGGCTCAimTGTAATCCTCACATmCAGAGCimGTGGGTGCAnGCC 
TGAGCTCAGGAGnTGAGACCAGCnGGGCAACATGGCAAAACCCCATCTCTACTAAAAA 
TACTAAAAAnAGCTGGGCGTGCTGGTGAGTGCCTGTAATCCMTACTlXAGAGGG^ 

7,697 AGGCAGAAAAnGCTTGAACCCGGGAGGTGGAGGTTGCCATGAGCCGAGATCGTGACAGT 
GCACTIXAGCCTGGGimACAGIMACTCTCTCTCAATAAAATAAAATm 
AAAArAAATACATAAATAAATCTCCTATATAACCTCATAATATCAGATnGGAGCCTTTT 

7,877 (lATAGAAATGAAAnCAGAAGAAGCTGAGACTCAGATAnimGCTGCCTGGTGCTCT 
GTGAATAGAGGAGACTTGTTCnGTGAAATCTGAGTGCAAAGACACAGGACAAAnGnA 
TCTACTTnCATTCCTAAGGATACTGTATCGCCCTAAAACACAAGAACTAGAAnCTGTG 

8, 057 ATACCACGGGTACTCCACAGTGTGnCCnCCCCTTTCTGAACCTGATnGTCTCATCTC 
TATGAAAAGATGTGGGCTnGGGGTCAGATGTGGGTTGGAATCCTAGCGCCTGTETGGCT 
GCAAnTTCTTTTGTGTAAAATTGAGATAATAGTACAAAAGTAACAACAGTTAATATTAT 

8,237 CAAGTGCnACTCTGTGCCTEGCACTGTGnAAAnCTCTAAGTGTATmCTCATnAA 
TTTTTGTGATAGGCnAlTiACAaAnACTATCTTCATATTACAGTGAGGGTTD^^ 
GnAAGGTTIXATAACTAGTCAGCAGACCTGGGACTTCAinCCAGGCAGCTGAnCCAM 

8,417 GCCTAnCTAACTnAAACTGCTACTTTnGGACTGTTGTAAGAAGGAEAATnATATAA 
, AATGTTGGCACATAGTGGGTGCTGCTGnATATGAATGGGCACAAAATCTGTCTACATn 
TGlXTmACCAAATnAGAATCTAnTAGTTAAAACCTTCTTAGGGCGGCTGGAGTGCA 

8,597 GTTGCTCAnCCTGTAATCTCAGCACACTGGGAGGtXAAGGCAGGAGGAnGCnGAGCC 
CAGGTGTTTGAGAIXAGCCTGGGCACATAGTGAGAimCATCTCTIX 
AACAAACAAAAA(MA(mCTAGCTGGG(mGTGGTGimTGTAn(X^ 

8,777 TIMAGGCTGGGGTGGGAGAATGGCnGAGCCCAGGAGnCAAGGTTGCAGTGAGCTAT 
CATCACAGTACTGCACTlXAGCTTGGGCAGimTGAGAIXCTGTCTCGAAA^^ 
AAAAATAAAAACncnAGGACAGAGTGAnAGAAGCTCTCTAGTAGATACTTAGTAACA 

8,957 ATUTGGGnCCTCGGGCAG 
, I GTTATlXTACGCAGAGAGCCAGinACCAATGGGTGCGCTCC 

AATIXMlACAmncrGCAAACTGCCniMiAAAAAaiA™ EXDN 2 

, CCAGGT6AATCCAACAAGATCCCCCGTCTGAGGACTGACCTnnCD| ' 

GTAAGTGGACm 

9. 137 TCTCTAAnAAnAAnAAnACTTATTTATnGAGACGGAGTnCACTTTTCnGCCCA 
GGCTGGAGHiCW^rGGCGCAATCTTAGCTCACTGlMCTIXGIXTIXTGGGnCAAGra 
AnCTCCTGCnCAGCCTCTGGAGCAGCTGGGAinCAGGCGCCTGCCACCATGCCCAGC 

9,317 TAATnTmiTTTTnTTTTTTTGAGACGGAGTCTCACTCTGnGCTCAGGCTGGAGTG 
CAGTGGCGCAATCTimTCACTGCAAGCTIXACCTCCTGGAnCACGIXAnCT™ 
CTCAGlITimGTAGCTGGGACTACAGGCACCCGIXACCACGCCCGGCTAATTTTm 

9,497 CTAnmAGTAGAGACGGGGTTICACCTTATTAGCCAGGATGGTCTCGATCTCCTGACC 
TAGTGATIXGIXmTTliACrrCCCAAAATGCTGGGAnACAGGCGTGAGCra 
CTGGCCTAATnmGTAnAnAGTAGAGACGGGCTnCATCATCTTGGCCAGGCTGGr 

9,677 CTCAAACTrcTGAlXTeAGGTGATCCACCCACCnGGKTimAAGTGnGGGATTACA 
AGCATGAGCCACTGTACCCGGCCTTnCTCTAATTTTAAAGTGTCTGTAATTTCACAACC 
TCTraCACAGATGTGGGAGTGTnnCTTCAAGCTGT(XA6AGTGTmGCn(m 

9,857 CTTGCnTGGTAGTTTGGCTCnACTCTGCAGTACATGGTAAAAGTGTACTGTATATACT 
GGCATArGACAmCGAGTATACATGAmArcTATGTTmGAAATTTTTm^^^ 
WAGAGAGGAGCATMGGACTnTCATCAACAGGTAnGAAAATGATTGAACATTGT 

1 0, 037 TnATTlliTFAAACAGAACACACTATATATAAAAATCCAATAATTAACTGAA™ 
GCAAAATGTGGTATAAGCATACAAAGGAATATTAnGGGTCATAAAAAGAATGAAGIACT 
GATACATGCTACAACATAGATAAACCnGGAAACAnATGCAGAGCGAAGGAAGGCCAGA 

10,217 imAAAAGimTAnGrATGAniBmAGATGAAATGTIXAGAATAGGC^^ 
CTAGAGGCAGAAAGTAGAnAGTGGGnACAGGGGCTGGGGAAAGGGAGGAATAAGGAGT 
GACTGCTAAraHiTATGAGGGTmTmGGAGGAGGTGATTAAAATGHCnCTGCCAG 

10,397 GTGraiTGGCTCATCCeTCTAATimGCACTnEGGAGGCCGAGGCGGGAGGAnGTrT 
' GAGinXAGGAGmGAGGlOTCTGGGIMATACTGAGAaCTATCTCTAmiMT 
ACATnmATAnAAAAAAATGnCnCAAGTAGTTGGTAAnATTTTTAAAAATGGCC 

10,577 AGGTGCAGAGGCTCATGTCTGTAATinXAGCAaTTGGGAGGCTGAGGTGGGAGGATCCC 
TTGAGCCCAGGAGGmGGGAaAGIITGGGttATACAGCAAGACCCTGTC^^^^ 
AATACAAAAAnAGCTAGGCATAGTGATGTGCACCTGTGGICCCAGCTACTCGGGAGGCT 
10,757 GAGIHAGGAGMTCTCnGAGIXTATGnGAGlXTGmiMXCTAnTATGl^ 
CACTIXAGlXTAGGCMAmAAGACTCTCTCTCAAAAMAGAAAAAAAA^ 
AAAGIMIAATAAnAGClMiACnBTAAAACAAAAATCAAATCTCTTCTmGATC^^ 

10,937 AT/WW«:TTGCmAMnGCAAAAAAllAIXTGATATAAAnCATAACTAACA^ 
TtMATTATAnAGAAAaAnAAnCAATCAATACTAAAKTATinAGGAM^^ 
TATACATAnAAGAAAAKiAlTATCATAAAAGTTnAATCTCCAGGCTCAAACCTAGAM 

11,117 ATCACTCTCCTCAAAimGGGnAATCATCATGaiXAAAimCTACAmCAW^^ 
ClTTGGGATCCTGGCAACTTTCTCTnTGTTTnTTTnTTTnTGAGACAGGGTCTCCT 
CTCTCAlXCAGGATOiAGTGCAGTGGTinGATl^^TAGCTCACTGCAGCCTGCAAa^ 

11,297 AAGTGATCniXTGCCTCAGCCTC^^ 

CATCTAAnAAAAAAATTTTTTTTTGTAGAGACAGGGGTCTCTGTACATTTCCCAGGCTG 
GTCATGTACnXTAAGCTCAAGCAGTaTinAaTCAGIXTIXCAAAGTGIXGG 

11.477 CAGTCAniAGCCAttAnCCCAGCGCTGGTGACTnCTCCATCACTGGTGACTTTCTCCA 
TCACTGGTAnCACTGCAnAGTGATGACATCATTACAATCTTCAATATGCAACTnGTA 
GTCCTACTCTTGCAncnACmAAAGCCTCAGCATTAAGnTGAATGTAATAnACAG 

11,657 CATCCnCAnACmAAATCAnGGmCAATAGTAATTCATnAAATCTAAAATGnA 
GGCTGCAGTGGCTCATGCCTGTAATimCIOTTGGGAGACTGAGGTGGGAGAATC^ 
TTCAGGIXAAGAAnTGAGAIXAGinGGGCAACACGGCAAGACCCCATCTCTAAAA^ 

11,837 AGTGGimGimTCTGrcTCAimGTAATCClMACmGGCAGGCCGAGG™ 
TAGCnGAGETMACTTlMiATCAGIXTGGimATGGCGGAAAIXCAmCT^ 
AAAAATACAAAAATTAGC7GGGCATGGTGGCACGCCTGTAATCCCAGCTATTGAGAGGCC 

12,017 GAGGCAGGCAmGGGAGGCCAAGGCAGGCAGATTGanGAGACCTGCCTGGGTAACA 
TGGAGAAATlITGTCTCTACAGAAAAATAIMAAmGTOGCATGGAGAAACCT^^ 
CTCTATAGAAAGACACAAAAACTAGCCATGCATGCCTGTGGTCCAGCTACTCGAAAGGCT 

12,197 GAGATGGGAGGAnGCTTGATCCTGAGAGGTCAAGGCTGAACTGAGCCATGGTGTGGCAC 
TGCACTimimTGACAGACTAAAACCTTGTCTCAAAATAAATAAACACAmA^ 
ATAAArAAATACAATTAAAACTAAAAnAAAAAATAAAATAAAATdTTAAGAGAATAGCT 

12. 377 CAAAnCTKAAAAGAACTCTTGCACACCAnCCTCCTCnCTCAAATCTCTAimC^^ 
TOlXAAAGIXAGTAACTGCTTCTCAIXCTGACCCTinGCmm 
AAGAATGGTlXTTGCnCTGTGCTGATIICAAAIXCTnTGracm 

12,557 TTCCCTTOCTQCTCT6TAnGGCTGTGGGGTGGGGGTGGCGGTGGAACTGAC(ICTC^ 
GGTCTGCATnCTCAGGCTCCCAGGGCTGTGGCTGACnTGGCCAATGGGAGGCAAGG^ 
GGGAGACniACAGCTTGGGAGGAAGGGAGAGAGGTATCmcnCTCCnACTCCCTGCC 
12,737 TGGGGTffiC/mTGGGCAGGACTCTGrrnamTGGCCTCAGCTlX^ 
TCT/mCTGaiCTCAGGAAATAGACAAKTCCnCCACTATCGCTGTAGCCC^ 
GGAACTATTTmnCTncmCTncmiXTnTTTTTTTTTACAGACrCTCACTCT 

12,917 TGTTGCIXAGGCTGGACTGCACTGGTGCGATCTCAKTCACTGimTCTGIX^ 

GnCAACTGAnCTKTGIirCAGrcTCCAGGGTGTGCCACTATGCCCAGCTAATTTTTG 
TATTmGGTAGAGACGGGirnnGttATlJrGAIXAGGCTGGTCTTGAACnaCAra 

13, W CAAATGATClTOTGIXTanCCTIXCAAAGTGCTGGGATTACAGGCATGAGCCACTGTG 
(XTGGCCGAAGGAAATAnnCTTGCTATTGCTAATCTCTGGGTTACCTCGCTATCCCCC 
ATnAGCTTCACTTCTCCTCCATCACCTGTATGAGGAATTCCCTCTGTGTTAAATATCTG 

13,277 GAGAAGm(ITGAn6GA(XCTGGCTGnG(Mn(mG(XACCTCTCmGIE^ 
TGGTAltCTITTIXCATGCATCTTCTCCAGGACnilATTCTGCACT^^ 
TCAGTGTCncnCCCATCAGTATAGGGGTGGACniAGTATCTCCTATGTnAGGIM 

13,457 ATCTCTanTGACETGOTCTTCTCCAGTGGnGCCCTTCTCTGCTCCTCTTCACAAT 
AACAIXT(XlBAAGGGlXAIXCATG(XTG(mCT(XTn(ITCA^^ 
GGACnCTGTTKTACACTIXAlXCTGGnGAimGTCACTGAmCTlCTCTATm^ 

13,637 AGCTTACTTGATKTTAAncaTTlMAACAGCTAACTGGGCCATGCATGTAAT^ 
GCACnCGGGAGGOMmGGAGGATCACnGAGIXCAGGAGTTCAGGA^ 
CTGGGCAACATAGTGAGACCCTATCTACAAAAAATAGAAAAAmGimGIMGTGA^ 

13,817 TCATGCTTGTGGTCCCAGCTACAAAGGAAGCTGAGGTGGGAGGATGGCnGAGTCCGGGA 
GGGIGAGGCTGCACTTGAGIXATGATCACGimTGCACTCCAGIXTGi^ 
ECCCmCTGTAAAAAAAAAAATGGnGAimTlXnCCTTGAAATGCTTTTTTCnG 

13,997 AGGCnCCATGlICTGIXnATlXTGmcnrcTACnCTCTGGnGTGCTTTrTlXTC 
TGCTCAGrATnAACATGTTGGTIiTGAinTGGCTCTGGCCTGGGIX^ 
AlMiCTnCTCTCGAMCTIIATCGGnGCATGGGmAAaACCAAATCTGTGAn 

14.177 CTAGCTlMAramCTAAAGTAGCimTGin^CACTimAnACAnceC^^ 
nCATnCTCTGTGCCACCTATGATTnCCTGATnAmAnCACTTTTCAnGTCTGT 
CTTimACTAAAATAAAAACTTCnGAGAAGGGGCnCATIBTCTGIXTCTGnD^^^ 

14,357 CimGIICTCAAAACAAGGAlIAGATAnCAACAAATAmAnGAAT^^^ 
' nAAAACrCTAATTGGTTGTATGCTGGTGGrrTAnATTTTCATGGAGGAAATGACTTCT 
AGGCTGTGACACTCAGCmTGTCTCTGATGCTTTGnGCCCTGTTCTGTCACCGAGGGC 

14,537 TGTCGTCAnGCTCTGGIXATmGTGDTITrTGAAmCTAATCATCACACTCAAm 
GAAGGCAGIITTACCTnCAGCACTCTTIMTGAATGAGTGlMTTGGAGGm 
ATTTTnGATAGGAAAnGAATGTnATATGCTGGTAAATATAAAGCnAGCTTTnACA 
14.717 AAGMnTCTCAAAAinmTniJriMiCCCTmnGnAIMTmATm 
ATnTAATnAGGAAAAAATGICATCTGmGGGCTGACTTAIJrGTTAirrTGTnGTCC 
mcnTTTTnGGTGGAGGGTATGGAGTTnGCTCTTGTAACCCAGGCTGGAGTGCAGT 

14,897 GGCGCGATCTOCTCACTGCAAaiCCGlXTIXTGGGTrCAAGmTCTCTCACCTCA 
GKTTCCGAGTAGCTGGGAnACAGGCATGCACCACCACACTTGGCTAATTTnGTATTT 
TAAGTAGAGACCGGGTnCACTATGnGGTCAGGCTGGnTCGAACTCCTGACCTCAACT 

15, 077 GATCAimCCTTGGCCTCCCAAAGTGCTTGGATTACAGACATGAGCCACCACACCCGGC 
CAAGAGGACTTCTTnAAAAATGATTTCTTGGGCCGGGTGCAinGGCTCACACCTGTAAT 
CCCAGCACTTTGGGAGGCTGAGGTGGGTGGTTCACAAGGTCAGGAGTnGAGATCAGCCT 

15,257 GGmTATGGTGAAACTlIATCTCTACTAAAAATACAAAAAnAGCimATGCT^^^ 
GCACCCCTGTTGTCCCAGCTACTCGGGAGGCTGAGGCAGGAGAATCACTTGAACCTBGGA 
GGTGGAGGTTGCAGTGAGC(MIATGGCA(n:ACTGCACTCCAGCCTGGGCAACAGAGCAA 

15,437 GACTCTGCCTCCAAAAATAAAAAnAAAATGAnTCnAAGTAAAmCAAATATAGAAT 
GTATATGCTAGTGATAACAAAAnAACACTCnTATGCAAGTCTGCAATAGGTAGATinG 
AAGnGATAGGTGCAATAAGTATAGGimACATAGGAACATnGAIXTGTTnmCT 

15,617 TGAnnAAAACATTGAATAAnCGGAAGCTrnAAATCTCTTAAnTGAGCAACTAGAT 
GGCTGTAmATCTCCTTATATTAAAAAAACTATTATAAnATCrrTCCCACATATCAAA 
CTaACTGGTTTTmCCCATTTTTCTnCATACTTCAG 

AAAGACGAGMTCCAGGACTT j 

15,797 iGAATCGTATCnCCCACmCTGAGGACTACTCTGGATCAGGCnCGGCTm 
CTCTGGATCAGGATCTGGGACTGGCnirfAACGGAAATGGAACAGGAnATOC 
AGACGAAAGTGATGCmraGACAACCTTAGGTaCTTGACAGGAATCTGCTCTCAGA 

15,977 CAGCCAGGAaTGGGTCAACATGGAnAGAAGAGGATTTTATGnATAAAAGAGGATTTr 

ClIACCTTGACACCAGGCAATGTAGnAGCATAnnATGTACCATGGnATATGATTAA ™ ^ 
TCTTGGGACAAAGAAnnATAGAAATTnTAAACATCTGAAAAAGAAGCTTAAETnTA 

16, 157 TCATCCTnTTmCTCATGAATTCnAAAGGATTATGCTnAATGCTGnATCTATCTT 
AnGnCTTGAAAATAttTGMTTTmGGTATCATGnCAACCAACATCAnAT^ 
TAAnAGAnCCCATGGCCATAAAATGGCTnAAAGAATATATATATATnnAAAOTAG 

16,337 CTlTlAGAAGIMTTGGIMTAATAmCATAIXTAAAnAAGACTCTGAC^^ 

TGAAHATAATGATAIliimnTTCTTATAAAAACAAAAAAAAAATAATGAAACACAGT 
GAATrTGTAGAGTGGGGGTATnGACATATTTTACAGGGTGGAGTGTACTATATACIAn 

16,517 ACCnTGAATGTGmCCAGAGCTAGTGGATGTETTTGTCTACAAGTATGAnGCTGnA 
CATAACATOAAAnAAaiXlMTTAAAACACAGTTGTGCTGTCAATACCTCATACT 
GCTnACCTT TTTnCCTGGATATCTdTGTAnTTCAAATGnACTATATAnAAAGCAG | 

16,697 lAAATATAACC | 
-504 MnCTAGCAllACTCTGGAC(rrTAAaiG(miHTCA7CCT(SGGGCTQ^^ 
-429 CCCTGCTTGTGlXTGACTCTGTGCa:GC(XA6CnCTCTnGATGTGCGCTGTGGATGAGaGAK 
-354 CAACAGCTGAGTCCTCCTGTCTGnTAGATTGTTACCTGAAGGAAGGGAGGGGGAAGAAAGTGCTGAnCGACn 
-279 mGATGBGGAAAACnnmnAAACATGCAAATGACAGATGGCAGAGCTTTnGGAAAAAGAAAAAATAATA 
-204 ACCACACAGimGCCTAGGGGGAGTIXGGTGGAGTnCATCATGGGTATGAACAGninTGTTT^ 
-129 ncncnCinCTGGGTGnGATGTGGATCTCnTCTAmGTTIMAAACTirrGACGTGTGncnGGGI^^ 

exon 

-54 GGTCTGAGGmTGGAA(XTCTnCTAAAAGGGACAGAAAGABm7GCTACATnGCTAATlIAGAGGCTGA 

START

GTGGAGCCGAGCTGGTCAGG | ATG CAG GTT CCC GTC GGC AGC AGG CTT GTC CTG GCT CTC
                       MET GLN VAL PRO VAL GLY SER ARG LEU VAL LEU ALA LEU

                                     exon 1 -|
GCC TTC GTC CTG GTT TGG GGA TCT TCA GTG CAA Ggt aagagacccftggatctttaatic-
ALA PHE VAL LEU VAL TRP GLY SER SER VAL GLN (GLY)

                          |- exon 2
-( 8 kb)- ggttccttgttcgcaca gGT TAT CCT GCT CGG AGA GCC AGG TAC GAG TGG GTC
                         (GLY) TYR PRO ALA ARG ARG ALA ARG TYR GLU TRP VAL

CGC TGC AAA CCG AAT GGC TTT TTT GCT AAC TGC ATC GAG GAG AAG GGA CCA CAG TTT
ARG CYS LYS PRO ASN GLY PHE PHE ALA ASN CYS ILE GLU GLU LYS GLY PRO GLN PHE

                                                             exon 2 -|
GAC CTA ATA GAT GAA TCC AAT AAC ATC GGC CCT CCC ATG AAT AAT CCT GTT TTg taa
ASP LEU ILE ASP GLU SER ASN ASN ILE GLY PRO PRO MET ASN ASN PRO VAL (LEU)

                                          |- exon 3
gtagactticatcgo-t -( 4 kb)-ttmicittgiatttt agG ATG GAA GGA CCC TCA AAA GAT
                                         (LEU) MET GLU GLY PRO SER LYS ASP

TTC ATC TCC AAT TAT
PHE ILE SER ASN TYR

GAT GAC
ASP ASP

TAT
TYR

GGG TCA GGT TCG GGC TCC GGC TCT GGC TCC GGC
GLY SER GLY SER GLY SER GLY SER GLY SER GLY

TCT GGC TCG GGT TCC GGC TCC GGA AGT GGC
SER GLY SER GLY SER GLY SER GLY SER GLY

TTC CTA GGT GAC ATG GAA TGG GAA TAC
PHE LEU GLY ASP MET GLU TRP GLU TYR

CAG CCA ACA GAT GAA AGC AAT ATT GTC TAT TTC AAC TAT AAG CCT TTT GAC AGG ATT
GLN PRO THR ASP GLU SER ASN ILE VAL TYR PHE ASN TYR LYS PRO PHE ASP ARG ILE

CTC ACT GAG CAA AAC CAA GAC CAA CCA GAA GAC GAT TTT ATT ATA TGA
LEU THR GLU GLN ASN GLN ASP GLN PRO GLU ASP ASP PHE ILE ILE STOP

ATGTGACGGTCTCTGTCTCCCCACCTCCATGTGGAACAATGTAnCAGTATACUAGTGTACCACGTTTAAA^^ 

CCAGTCTCAGGATAAAGAGinTACAGAAAATTTAAAATGCCTGGAAAAGACTCnGAATCCTGTTACCCCTnC 

CTCAnAACTCGTAAGGAAnATGCTnAATGCTGnACCTATCTTGTTCTTCTGGAAAATGCCTGCATnATGT 

GTAnGAATCAACATnAAGAAAnAACACACACCCCCAnAnATACAATAACTnCAAAGCCATACTGGTTTT 

GAAAATnTAATnGATAGCAAGnGATGAACAATCTnCATACCTAAAGTmAGGAACCCAACTCGCAHGT 

GAAnACAAATATATTCCTHATGTGATTAAAAAGAAAATAAAGTG 
                           hGH EXPRESSION
CONSTRUCT              RBL-1 CELLS / Rat-1 FIBROBLASTS

pPG(-504/+24)hGH       0.89 ± 0.15                  (18)  (18)
pPG(-423/+24)hGH       0.82 ± 0.13    ND            (6)
pPG(-333/+24)hGH       0.79 ± 0.11    0.03 ± 0.01   (6)   (6)
pPG(-250/+24)hGH       1.24 ±QJ3      0.05 ± 0.02   (8)
pPG(-190/+24)hGH       1.52 ± 0.48    1.06 ± 0.27   (8)   (8)
pPG(-118/+24)hGH       i21 ± 1.61     1.21 ± 0.65   (14)  (14)
pPG(-81/+24)hGH        0.46 ± 0.16    0.22 ± 0.12   (10)  (10)
pPG(-63/+24)hGH        0.34 ± 0.12    0.20 ± 0.10   (8)   (8)
pPG(-40/+24)hGH        0.41 ± 0.12                  (8)   (8)
pPG(-20/+24)hGH        0.00           0.00          (8)   (8)

504 bp 5' FLANKING REGION OF THE MOUSE SG-PG GENE    hGH GENE

+24    TRANSCRIPTION INITIATION SITE
RBL-1 CELLS    Rat-1 FIBROBLASTS
1 2 3 4 5 6
hGH mRNA